Java8、在流中使用.parallel导致OOM错误

Java 8, using .parallel in a stream causes OOM error

在书 Java 8 In Action, section 7.1.1 中,作者指出流可以通过添加函数 .parallel().他们提供了一个名为 parallelSum(int) 的简单方法来说明这一点。我很好奇它的效果如何,所以我执行了这段代码:

package lambdasinaction.chap7;

import java.util.stream.Stream;

public class ParallelPlay {

    public static void main(String[] args) {
        System.out.println(parallelSum(100_000_000));
    }

    public static long parallelSum(long n) {
        return Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .parallel()
                .reduce(0L, Long::sum);
    }
}

令我惊讶的是,我收到了这个错误:

Exception in thread "main" java.lang.OutOfMemoryError
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at java.util.concurrent.ForkJoinTask.getThrowableException(Unknown Source)
    at java.util.concurrent.ForkJoinTask.reportException(Unknown Source)
    at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
    at java.util.stream.SliceOps.opEvaluateParallelLazy(Unknown Source)
    at java.util.stream.AbstractPipeline.sourceSpliterator(Unknown Source)
    at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    at java.util.stream.ReferencePipeline.reduce(Unknown Source)
    at lambdasinaction.chap7.ParallelPlay.parallelSum(ParallelPlay.java:15)
    at lambdasinaction.chap7.ParallelPlay.main(ParallelPlay.java:8)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.stream.SpinedBuffer.ensureCapacity(Unknown Source)
    at java.util.stream.Nodes$SpinedNodeBuilder.begin(Unknown Source)
    at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
    at java.util.stream.SliceOps$SliceTask.doLeaf(Unknown Source)
    at java.util.stream.SliceOps$SliceTask.doLeaf(Unknown Source)
    at java.util.stream.AbstractShortCircuitTask.compute(Unknown Source)
    at java.util.concurrent.CountedCompleter.exec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(Unknown Source)
    at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)

我是 运行 Java 1.8.0_45 Windows 7, SP1 四核处理器。怎么回事?

这里你创建一个无限流然后限制它。并行处理无限流存在已知问题。特别是没有办法有效地将任务分成相等的部分。在内部使用了一些启发式方法,这些方法并不适合每项任务。在您的情况下,最好使用 LongStream.range:

创建有限流
import java.util.stream.LongStream;

public class ParallelPlay {

    public static void main(String[] args) {
        System.out.println(parallelSum(100_000_000));
    }

    public static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }
}

在这种情况下,流引擎从一开始就知道你有多少元素,因此它可以有效地拆分任务。另请注意,使用 LongStream 更有效,因为您不会有不必要的装箱。

如果您可以用有限流解决您的任务,通常会避免使用无限流。