Differences between Collectors.toMap() and Collectors.groupingBy() to collect into a Map

Question

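I want to create a Map from a List of Points, in which all entries of the list that share the same parentId are mapped under that key, i.e. a Map<Long, List<Point>>.
I used Collectors.toMap(), but it does not compile: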

Map<Long, List<Point>> pointByParentId = chargePoints.stream()
    .collect(Collectors.toMap(Point::getParentId, c -> c));

Answer 1

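Collectors.groupingBy is exactly what you want. It creates a Map from your input collection: the Function you provide produces the key of each entry, and the value is the List of Points associated with that key.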

Map<Long, List<Point>> pointByParentId = chargePoints.stream()
    .collect(Collectors.groupingBy(Point::getParentId));
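For anyone who wants to run this, here is a minimal self-contained sketch. The Point class is not shown in the question, so the one below (with a parentId and a name field) is only an assumption for illustration, as is the byParentId helper name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingByDemo {

    // Hypothetical minimal Point; the question does not show the real class.
    static final class Point {
        private final Long parentId;
        private final String name;

        Point(Long parentId, String name) {
            this.parentId = parentId;
            this.name = name;
        }

        Long getParentId() { return parentId; }
        String getName() { return name; }
    }

    // Groups points by parentId; each key maps to the list of points sharing it.
    static Map<Long, List<Point>> byParentId(List<Point> chargePoints) {
        return chargePoints.stream()
                .collect(Collectors.groupingBy(Point::getParentId));
    }

    public static void main(String[] args) {
        List<Point> chargePoints = Arrays.asList(
                new Point(1L, "a"), new Point(1L, "b"), new Point(2L, "c"));

        Map<Long, List<Point>> pointByParentId = byParentId(chargePoints);

        // Parent 1 holds two points, parent 2 holds one.
        pointByParentId.forEach((id, points) ->
                System.out.println(id + " -> " + points.size()));
    }
}
```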

Answer 2

The following code does the job. Collectors.toList() is the default downstream collector, so you can omit it, but if you want a Map<Long, Set<Point>> you need Collectors.toSet().

Map<Long, List<Point>> map = pointList.stream()
                .collect(Collectors.groupingBy(Point::getParentId, Collectors.toList()));

Answer 3

TL;DR:

To collect into a Map that holds a single value per key (Map<MyKey, MyObject>), use Collectors.toMap().
To collect into a Map that holds multiple values per key (Map<MyKey, List<MyObject>>), use Collectors.groupingBy().
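A side-by-side sketch of the two cases may help. It assumes a hypothetical Point with a unique id in addition to parentId; neither the id field nor the helper names come from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapVsGroupingBy {

    // Hypothetical Point with a unique id and a shared parentId.
    static final class Point {
        private final Long id;
        private final Long parentId;

        Point(Long id, Long parentId) {
            this.id = id;
            this.parentId = parentId;
        }

        Long getId() { return id; }
        Long getParentId() { return parentId; }
    }

    // One value per key: ids are assumed unique, so toMap() fits.
    static Map<Long, Point> byId(List<Point> points) {
        return points.stream()
                .collect(Collectors.toMap(Point::getId, p -> p));
    }

    // Several values per key: parentIds repeat, so groupingBy() fits.
    static Map<Long, List<Point>> byParentId(List<Point> points) {
        return points.stream()
                .collect(Collectors.groupingBy(Point::getParentId));
    }

    public static void main(String[] args) {
        List<Point> points = Arrays.asList(
                new Point(10L, 1L), new Point(11L, 1L), new Point(12L, 2L));

        // Three distinct ids give three entries.
        System.out.println(byId(points).size());
        // Two of the points share parentId 1.
        System.out.println(byParentId(points).get(1L).size());
    }
}
```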

Collectors.toMap()

By writing:

chargePoints.stream().collect(Collectors.toMap(Point::getParentId, c -> c));

the returned object will have the type Map<Long,Point>.
Look at the Collectors.toMap() function that you are using:

public static <T, K, U> Collector<T, ?, Map<K,U>>
toMap(Function<? super T, ? extends K> keyMapper,
      Function<? super T, ? extends U> valueMapper)

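It returns a Collector whose result is a Map<K,U>, where K and U are the return types of the two functions passed to the method. In your case, Point::getParentId returns a Long and c refers to a Point, so a Map<Long,Point> is returned when collect() is applied. That is why assigning the result to a Map<Long, List<Point>> does not compile.

This behavior is expected, as the Collectors.toMap() javadoc states: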

Returns a Collector that accumulates elements into a Map whose keys and values are the result of applying the provided mapping functions to the input elements.

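But if the mapped keys contain duplicates (according to Object.equals(Object)), an IllegalStateException is thrown.
That will probably be your case, since you want to group the Points by a specific property: parentId.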
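The duplicate-key failure is easy to reproduce. The sketch below assumes a hypothetical Point that carries only a parentId, and wraps the collect() call in a helper (throwsOnDuplicateKeys, a name made up here) that reports whether the exception was raised.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ToMapDuplicateKeyDemo {

    // Hypothetical minimal Point; only parentId matters here.
    static final class Point {
        private final Long parentId;

        Point(Long parentId) { this.parentId = parentId; }

        Long getParentId() { return parentId; }
    }

    // Returns true if the two-argument toMap() rejects the input
    // because at least two points share the same parentId.
    static boolean throwsOnDuplicateKeys(List<Point> points) {
        try {
            points.stream()
                    .collect(Collectors.toMap(Point::getParentId, p -> p));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Point> unique = Arrays.asList(new Point(1L), new Point(2L));
        List<Point> duplicated = Arrays.asList(new Point(1L), new Point(1L));

        System.out.println(throwsOnDuplicateKeys(unique));     // no duplicate keys
        System.out.println(throwsOnDuplicateKeys(duplicated)); // parentId 1 appears twice
    }
}
```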

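If the mapped keys may contain duplicates, you could use the toMap(Function, Function, BinaryOperator) overload, but it will not really solve your problem, because it does not group the elements that share a parentId. It only lets you decide which value to keep when two elements have the same parentId, so each key still maps to a single Point.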
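To illustrate what the three-argument overload does, here is a sketch with a merge function that keeps the later element. The Point class and the lastPerParent helper are assumptions for the example; on a sequential stream over a List the merge function receives the existing value first and the new one second.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapMergeDemo {

    // Hypothetical minimal Point with a parentId and a name.
    static final class Point {
        private final Long parentId;
        private final String name;

        Point(Long parentId, String name) {
            this.parentId = parentId;
            this.name = name;
        }

        Long getParentId() { return parentId; }
        String getName() { return name; }
    }

    // Resolves key collisions by keeping the later point.
    // The result still holds exactly one Point per parentId: nothing is grouped.
    static Map<Long, Point> lastPerParent(List<Point> points) {
        return points.stream()
                .collect(Collectors.toMap(
                        Point::getParentId,
                        p -> p,
                        (existing, replacement) -> replacement));
    }

    public static void main(String[] args) {
        List<Point> points = Arrays.asList(
                new Point(1L, "a"), new Point(1L, "b"), new Point(2L, "c"));

        Map<Long, Point> result = lastPerParent(points);

        // Point "a" is dropped in favor of "b" for parent 1.
        System.out.println(result.get(1L).getName());
    }
}
```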

Collectors.groupingBy()

To achieve what you want, you should use Collectors.groupingBy(), whose behavior and method declaration suit your need much better:

public static <T, K> Collector<T, ?, Map<K, List<T>>>
groupingBy(Function<? super T, ? extends K> classifier)

It is specified as:

Returns a Collector implementing a "group by" operation on input elements of type T, grouping elements according to a classification function, and returning the results in a Map.

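The method takes a Function.
In your case, the Function's parameter is a Point (the element type of the Stream) and it returns Point.getParentId(), because you want to group the elements by their parentId values.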

So you could write:

Map<Long, List<Point>> pointByParentId = 
                       chargePoints.stream()
                                   .collect(Collectors.groupingBy(p -> p.getParentId()));

Or with a method reference:

Map<Long, List<Point>> pointByParentId = 
                       chargePoints.stream()
                                   .collect(Collectors.groupingBy(Point::getParentId));

Collectors.groupingBy(): going further

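Indeed, the groupingBy() collector goes further than this example. The Collectors.groupingBy(Function<? super T, ? extends K> classifier) method is ultimately just a convenience method that stores the values of the collected Map in a List.
To store the values of the Map in something other than a List, or to store the result of a specific computation, groupingBy(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream) should interest you.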

For example:

Map<Long, Set<Point>> pointByParentId = 
                       chargePoints.stream()
                                   .collect(Collectors.groupingBy(Point::getParentId, Collectors.toSet()));

So, beyond the question asked, you should think of groupingBy() as a flexible way to choose what you store as the values of the collected Map, which toMap() clearly is not.
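As a sketch of "the result of a specific computation", the downstream collector can also count the elements of each group or map them to another value first. Collectors.counting() and Collectors.mapping() are standard JDK collectors; the Point class and the helper names are assumptions for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class GroupingByDownstreamDemo {

    // Hypothetical minimal Point with a parentId and a name.
    static final class Point {
        private final Long parentId;
        private final String name;

        Point(Long parentId, String name) {
            this.parentId = parentId;
            this.name = name;
        }

        Long getParentId() { return parentId; }
        String getName() { return name; }
    }

    // Stores the number of points per parentId instead of the points themselves.
    static Map<Long, Long> countByParentId(List<Point> points) {
        return points.stream()
                .collect(Collectors.groupingBy(
                        Point::getParentId, Collectors.counting()));
    }

    // Stores only the distinct names of the points under each parentId.
    static Map<Long, Set<String>> namesByParentId(List<Point> points) {
        return points.stream()
                .collect(Collectors.groupingBy(
                        Point::getParentId,
                        Collectors.mapping(Point::getName, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Point> points = Arrays.asList(
                new Point(1L, "a"), new Point(1L, "b"), new Point(2L, "c"));

        System.out.println(countByParentId(points).get(1L));
        System.out.println(namesByParentId(points).get(1L).contains("b"));
    }
}
```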

Answer 4

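Quite often, a map from object.field to the collection of objects sharing that field is better stored in a Multimap (Guava has a good multimap implementation). If you don't need the multimap to be mutable (which should be the preferred case), you can use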

Multimaps.index(chargePoints, Point::getParentId);

If you need it to be mutable, you can either implement a collector (as shown here: https://blog.jayway.com/2014/09/29/java-8-collector-for-gauvas-linkedhashmultimap/) or use a for loop (or forEach) to populate an empty mutable multimap.

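Multimaps give you the extra features you commonly need when working with a map from a field to the collection of objects sharing that field (such as the total object count).

A mutable multimap also makes it easier to add and remove elements (without worrying about edge cases).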

Tags: java, java-8, collectors