java.util.stream.Collectors: Why is summingInt implemented with an array?

Question

The standard Collector summingInt internally creates an array of length 1:

public static <T> Collector<T, ?, Integer>
summingInt(ToIntFunction<? super T> mapper) {
    return new CollectorImpl<>(
            () -> new int[1],
            (a, t) -> { a[0] += mapper.applyAsInt(t); },
            (a, b) -> { a[0] += b[0]; return a; },
            a -> a[0], CH_NOID);
}

I was wondering whether it would be possible to define it simply as:

private <T> Collector<T, Integer, Integer> summingInt(ToIntFunction<? super T> mapper) {
    return Collector.of(
            () -> 0,
            (a, t) -> a += mapper.applyAsInt(t),
            (a, b) -> a += b,
            a -> a
    );
}

But this does not work, because the result of the accumulator seems to be ignored. Can anyone explain this behavior?

Answer 1

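An Integer is immutable, while an array (the int[1] in the JDK implementation, or equally an Integer[]) is mutable. An accumulator is supposed to be stateful: it has to modify the container it is given, not produce a new value.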

Imagine you have 2 references to 2 Integer objects.

Integer a = 1;
Integer b = 2;

By nature, the instances you are referring to are immutable: they cannot be modified once they have been created.

Integer a = 1;  // {Integer@479}
Integer b = 2;  // {Integer@480}

You have decided to use a as the accumulator.

a += b;

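The value a currently holds satisfies you. It is 3. However, a no longer refers to the {Integer@479} you had at the beginning; += created a new Integer and rebound a to it.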
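The rebinding can be checked directly with a small sketch. The class and method names here (RebindDemo, sameObjectAfterAdd, originalValueAfterAdd) are made up for illustration; the idea is just to keep a second reference to the original object and compare after the +=.

```java
class RebindDemo {
    // Returns true only if `a` still refers to the original object after `a += b`.
    static boolean sameObjectAfterAdd() {
        Integer a = 1;
        Integer original = a;  // second reference to the same Integer instance
        Integer b = 2;
        a += b;                // unboxes, adds, boxes the result, and rebinds `a`
        return a == original;  // reference comparison, not value comparison
    }

    // Returns the value still held by the original object after `a += b`.
    static int originalValueAfterAdd() {
        Integer a = 1;
        Integer original = a;
        a += 2;
        return original;       // the original Integer was never modified
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAfterAdd());    // false
        System.out.println(originalValueAfterAdd()); // 1
    }
}
```

The original object still holds 1; only the variable a was pointed at a different object.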

I added debug statements to your Collector to make things clear.

public static  <T> Collector<T, Integer, Integer> summingInt(ToIntFunction<? super T> mapper) {
  return Collector.of(
      () -> {
        Integer zero = 0;
        System.out.printf("init [%d (%d)]\n", zero, System.identityHashCode(zero));
        return zero;
      },
      (a, t) -> {
        System.out.printf("-> accumulate [%d (%d)]\n", a, System.identityHashCode(a));
        a += mapper.applyAsInt(t);
        System.out.printf("<- accumulate [%d (%d)]\n", a, System.identityHashCode(a));
      },
      (a, b) -> a += b,
      a -> a
  );
}

If you use it, you will notice a pattern like this:

init [0 (6566818)]
-> accumulate [0 (6566818)]
<- accumulate [1 (1029991479)]
-> accumulate [0 (6566818)]
<- accumulate [2 (1104106489)]
-> accumulate [0 (6566818)]
<- accumulate [3 (94438417)]

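Here 0 (6566818) is never changed, despite all the abortive attempts with +=. Each call to the accumulator receives the same untouched container, and the new Integer produced inside the lambda is simply discarded when the lambda returns.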
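As a rough illustration of the consequence, the following sketch runs the Integer-based collector without the debug output on a sequential stream. BrokenSumDemo, brokenSummingInt and run are hypothetical names, not part of the JDK.

```java
import java.util.function.ToIntFunction;
import java.util.stream.Collector;
import java.util.stream.Stream;

class BrokenSumDemo {
    // Same idea as the collector in the question: the container is an immutable Integer.
    static <T> Collector<T, Integer, Integer> brokenSummingInt(ToIntFunction<? super T> mapper) {
        return Collector.of(
                () -> 0,
                (a, t) -> { a += mapper.applyAsInt(t); }, // rebinds the lambda parameter only
                (a, b) -> a + b,
                a -> a);
    }

    static int run() {
        Integer result = Stream.of(1, 2, 3).collect(brokenSummingInt(i -> i));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run()); // 0, because the container from the supplier is never modified
    }
}
```

The stream hands the supplier's container to the accumulator and later to the finisher; since nothing ever mutates that container, the finisher sees the initial 0.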

If you rewrite it to use an AtomicInteger

public static  <T> Collector<T, AtomicInteger, AtomicInteger> summingInt(ToIntFunction<? super T> mapper) {
  return Collector.of(
      () -> {
        AtomicInteger zero = new AtomicInteger();
        System.out.printf("init [%d (%d)]\n", zero.get(), System.identityHashCode(zero));
        return zero;
      },
      (a, t) -> {
        System.out.printf("-> accumulate [%d (%d)]\n", a.get(), System.identityHashCode(a));
        a.addAndGet(mapper.applyAsInt(t));
        System.out.printf("<- accumulate [%d (%d)]\n", a.get(), System.identityHashCode(a));
      },
      (a, b) -> { a.addAndGet(b.get()); return a;}
  );
}

you will see a real accumulator (as part of a mutable reduction) in action

init [0 (1494279232)]
-> accumulate [0 (1494279232)]
<- accumulate [1 (1494279232)]
-> accumulate [1 (1494279232)]
<- accumulate [3 (1494279232)]
-> accumulate [3 (1494279232)]
<- accumulate [6 (1494279232)]
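For comparison, here is a minimal sketch of the same collector built with Collector.of and an int[1] holder, mirroring the approach of the JDK code quoted in the question. The names ArraySumDemo, arraySummingInt, run and runParallel are assumptions for this example only.

```java
import java.util.function.ToIntFunction;
import java.util.stream.Collector;
import java.util.stream.Stream;

class ArraySumDemo {
    // The one-element array is a mutable box: the reference stays the same, the content changes.
    static <T> Collector<T, int[], Integer> arraySummingInt(ToIntFunction<? super T> mapper) {
        return Collector.of(
                () -> new int[1],                           // fresh container holding 0
                (a, t) -> { a[0] += mapper.applyAsInt(t); }, // mutate the container in place
                (a, b) -> { a[0] += b[0]; return a; },       // merge partial sums
                a -> a[0]);                                  // unwrap the final sum
    }

    static int run() {
        Integer result = Stream.of(1, 2, 3).collect(arraySummingInt(i -> i));
        return result;
    }

    static int runParallel() {
        Integer result = Stream.of(1, 2, 3).parallel().collect(arraySummingInt(i -> i));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run());         // 6
        System.out.println(runParallel()); // 6
    }
}
```

A plain int[1] needs no synchronization here because each container is only touched by one thread at a time and partial results are merged through the combiner. For a plain sum, mapToInt(...).sum() on the stream avoids writing a collector at all.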
