How does Java differentiate two keys in a KV instance in Apache Beam?

Question

The Apache Beam version is 2.15.0.

In this code, the class Airport is used as the key of a KV instance, and in the end the mean is computed for each Airport instance.

c.output(KV.of(stats.airport, stats.timestamp));

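But how does Apache Beam internally compare two keys and decide whether two instances are the same? Are two instances treated as the same key if all class members have the same values? The documentation does not mention how keys are compared.

I would appreciate it if someone could help me understand.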

Answer 1

This is actually explained in the docs for the GroupByKey transform, which is the operation performed under the hood for the Mean aggregation:

Two keys of type K are compared for equality not by regular Java Object.equals(java.lang.Object), but instead by first encoding each of the keys using the Coder of the keys of the input PCollection, and then comparing the encoded bytes. This admits efficient parallel evaluation. Note that this requires that the Coder of the keys be deterministic (see Coder.verifyDeterministic()). If the key Coder is not deterministic, an exception is thrown at pipeline construction time.

Note that Mean uses Combine.PerKey, which is 'shorthand' for GroupByKey + Combine.GroupedValues.
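To illustrate the comparison described in the quoted docs, here is a minimal sketch that uses only the Java standard library, not Beam itself. The Airport fields (code, latitude) and the encode and sameKey helpers are hypothetical: the real Airport class is not shown in the question, and in an actual pipeline the bytes would come from the key's Coder rather than from a hand-written method. The sketch only shows the idea that two distinct objects count as the same key when their deterministic encodings are byte-for-byte identical, regardless of what Object.equals says.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class KeyEncodingSketch {

    // Hypothetical key class; the fields are assumptions made for illustration.
    // equals/hashCode are deliberately NOT overridden.
    static final class Airport {
        final String code;
        final double latitude;

        Airport(String code, double latitude) {
            this.code = code;
            this.latitude = latitude;
        }
    }

    // Stand-in for a deterministic key coder: the same field values
    // always produce the same byte sequence.
    static byte[] encode(Airport airport) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(airport.code);
            out.writeDouble(airport.latitude);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Keys are treated as equal when their encoded bytes are equal.
    static boolean sameKey(Airport a, Airport b) {
        return Arrays.equals(encode(a), encode(b));
    }

    public static void main(String[] args) {
        Airport a = new Airport("SEA", 47.45);
        Airport b = new Airport("SEA", 47.45);
        Airport c = new Airport("JFK", 40.64);

        System.out.println(a.equals(b));   // false: Object.equals compares identity
        System.out.println(sameKey(a, b)); // true: identical encoded bytes
        System.out.println(sameKey(a, c)); // false: different encoded bytes
    }
}
```

So, per the quoted docs, whether two Airport instances with identical member values end up in the same group depends on the key Coder producing identical bytes for them, which is why that Coder has to be deterministic.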

java

apache-beam