Of the two collect methods in the java.util.stream.Stream interface, is one of them poorly constructed?

Question

In the java.util.stream.Stream interface,

<R> R collect(Supplier<R> supplier,
              BiConsumer<R, ? super T> accumulator,
              BiConsumer<R, R> combiner);

the combiner is a BiConsumer<R, R>, whereas in

<R, A> R collect(Collector<? super T, A, R> collector);

the combiner is a BinaryOperator<A>, which is nothing but a BiFunction<A,A,A>.

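The latter form makes it clear which reference holds the combined object after combining: it is the value the combiner returns. The former form does not.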

So how does any Stream implementation library know what the combined object is in the former case?

Answer 1

The collect method is meant to be used like this:

ArrayList<Integer> collected = Stream.of(1,2,3)
    .collect(
        ArrayList::new, 
        ArrayList::add, 
        ArrayList::addAll);
System.out.println(collected);

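The first argument is a supplier that supplies an empty array list for the collected items to be added to. The second argument is a BiConsumer that consumes each element of the stream. The third argument is only there to support parallelism: it lets collect gather elements into several array lists at the same time, and it asks you for a way to join all of those array lists together at the end.

How does collect know the combined result if you never return the array list with the added items? That's because ArrayList is mutable. Somewhere in the implementation, it calls accumulator.accept: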

// not real code, for demonstration purposes only
accumulator.accept(someArrayList, theNextElement);

someArrayList will retain all the changes made to it after accept returns!

Let's put this in a more familiar scenario:

ArrayList<Integer> list = new ArrayList<>(Arrays.asList(1,2,3));
doSomething(list);
System.out.println(list); // [1, 2, 3, 4]

private static void doSomething(ArrayList<Integer> list) {
    list.add(4);
}

Even though doSomething does not return a new array list, list is still mutated. The same thing happens with BiConsumer.accept. That is how collect gets to "know" what you did to the array list.
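
To connect this back to collect, here is a minimal sketch of what the sequential case might look like. The class `SequentialCollectSketch` and its `collect` helper are hypothetical names invented for this illustration; this is not the JDK implementation, only a stand-in showing that no return value is needed from the accumulator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

public class SequentialCollectSketch {

    // Hypothetical helper for illustration only; this is NOT the JDK implementation.
    // It mimics the sequential case of Stream.collect(supplier, accumulator, combiner).
    static <T, R> R collect(Iterable<T> source,
                            Supplier<R> supplier,
                            BiConsumer<R, ? super T> accumulator) {
        R container = supplier.get();               // one mutable result container
        for (T element : source) {
            accumulator.accept(container, element); // mutates the container in place
        }
        return container;                           // the same reference, now filled
    }

    public static void main(String[] args) {
        ArrayList<Integer> collected =
                collect(Arrays.asList(1, 2, 3), ArrayList::new, ArrayList::add);
        System.out.println(collected); // [1, 2, 3]
    }
}
```

The helper holds on to the reference it got from the supplier, so whatever the accumulator did to that object is visible when the helper returns it.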

Answer 2

The combiner is only used for parallel streams, to merge 2 results that were computed in separate threads.

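In practice, the stream uses a consumer to accumulate the result produced by each thread. Each result is kept in its own container, and at the end a consumer combines the partial result of one container into another.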
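
One way to observe this is to count combiner invocations. The sketch below uses made-up names (`CombinerCallCount`, `combinerCalls`) and assumes an OpenJDK-style implementation, where a sequential stream never needs to merge partial results. The parallel count depends on how the stream happens to be split, so no particular number is claimed for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class CombinerCallCount {

    // Collects 0..9999 with the three-argument collect and reports how often
    // the combiner ran. The class and method names are made up for this demo.
    static int combinerCalls(boolean parallel) {
        List<Integer> source = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());
        Stream<Integer> stream = parallel ? source.parallelStream() : source.stream();
        AtomicInteger calls = new AtomicInteger();

        ArrayList<Integer> result = stream.collect(
                ArrayList::new,
                ArrayList::add,
                (left, right) -> {
                    calls.incrementAndGet(); // count every merge of two partial lists
                    left.addAll(right);      // fold the right list into the left one
                });

        if (!result.equals(source)) {
            throw new IllegalStateException("collect lost or reordered elements");
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("sequential combiner calls: " + combinerCalls(false));
        System.out.println("parallel combiner calls:   " + combinerCalls(true));
    }
}
```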

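With a BinaryOperator combiner, it is more like the following code:

// not real code: partials are the partial results computed in the threads
T[] partials = ...;
T result = supplier.get();
for (T partial : partials)
    result = combiner.apply(result, partial); // keep whatever the combiner returns
return result;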

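With a BiConsumer combiner, it is more like the following code:

// not real code: partials are the partial results computed in the threads
T[] partials = ...;
T result = supplier.get();
for (T partial : partials)
    combiner.accept(result, partial); // result is mutated in place
return result;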
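
For comparison, here is a small runnable sketch of both combiner shapes on a real parallel stream. The class `TwoCombinerShapes` and its method names are invented for this example. `Collector.of` takes a BinaryOperator, so its combiner may return a brand-new container, while the three-argument collect has to merge into its first argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class TwoCombinerShapes {

    // BiConsumer combiner: must merge the right container INTO the left one,
    // because nothing is returned to the stream implementation.
    static ArrayList<Integer> withBiConsumer(List<Integer> source) {
        return source.parallelStream()
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // BinaryOperator combiner: whatever it returns is used as the merged result,
    // so building a fresh list is allowed here.
    static ArrayList<Integer> withBinaryOperator(List<Integer> source) {
        Collector<Integer, ArrayList<Integer>, ArrayList<Integer>> collector = Collector.of(
                ArrayList::new,
                ArrayList::add,
                (left, right) -> {
                    ArrayList<Integer> merged = new ArrayList<>(left);
                    merged.addAll(right);
                    return merged;
                });
        return source.parallelStream().collect(collector);
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(withBiConsumer(source));     // [1, 2, 3, 4, 5, 6, 7, 8]
        System.out.println(withBinaryOperator(source)); // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```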

From the stream package description:

As with reduce(), a benefit of expressing collect in this abstract way is that it is directly amenable to parallelization: we can accumulate partial results in parallel and then combine them, so long as the accumulation and combining functions satisfy the appropriate requirements. For example, to collect the String representations of the elements in a stream into an ArrayList, we could write the obvious sequential for-each form:

 ArrayList<String> strings = new ArrayList<>();
 for (T element : stream) {
     strings.add(element.toString());
 }

or we could use a parallelizable collect form:

 ArrayList<String> strings = stream.collect(() -> new ArrayList<>(),
                                            (c, e) -> c.add(e.toString()),
                                            (c1, c2) -> c1.addAll(c2));
//  the requirements showing as an example           ---^

Answer 3

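In Java 9, the documentation of the Stream.collect(Supplier, BiConsumer, BiConsumer) method was updated, and it now explicitly states that you should fold the elements from the second result container into the first one: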

combiner - an associative, non-interfering, stateless function that accepts two partial result containers and merges them, which must be compatible with the accumulator function. The combiner function must fold the elements from the second result container into the first result container.

(Emphasis mine.)
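
To see why the direction matters, here is a deterministic sketch. The `merge` helper is a hypothetical stand-in for what a stream implementation does with two partial results (call the combiner, then keep the first container). It is not JDK code, and `FoldDirection` and `mergeWith` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.function.BiConsumer;

public class FoldDirection {

    // Hypothetical stand-in for what a stream implementation does with two
    // partial results: call the combiner, then keep only the FIRST container.
    static <R> R merge(R first, R second, BiConsumer<R, R> combiner) {
        combiner.accept(first, second);
        return first;
    }

    // Merges the partial results [1, 2] and [3, 4] with the given combiner.
    static ArrayList<Integer> mergeWith(
            BiConsumer<ArrayList<Integer>, ArrayList<Integer>> combiner) {
        ArrayList<Integer> first = new ArrayList<>(Arrays.asList(1, 2));
        ArrayList<Integer> second = new ArrayList<>(Arrays.asList(3, 4));
        return merge(first, second, combiner);
    }

    public static void main(String[] args) {
        // Correct: folds the second container into the first.
        System.out.println(mergeWith(ArrayList::addAll));     // [1, 2, 3, 4]
        // Wrong direction: the first container never receives 3 and 4.
        System.out.println(mergeWith((a, b) -> b.addAll(a))); // [1, 2]
    }
}
```

A combiner that folds the first container into the second still compiles, but under this model the elements of the second container are silently dropped from the result.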

java

java-8

java-stream

collectors