How is pipe syntax implemented in Apache Beam?

Question

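I am learning Apache Beam and, out of curiosity, would like to ask the following question.

I have already read the following documentation and topics.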

https://beam.apache.org/documentation/programming-guide/#applying-transforms

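I know that the pipe (|) is the Python version of Java's .apply. However, I would like to know how Python interprets the __or__ operator as a processor that handles each PCollection element passing through from left to right.

I would appreciate it if someone could educate me and point me to the relevant code.

Thanks, Yu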

Answer 1

I would like to mark @Kolban's reply as the answer.

I did a Google search on "python operator overloading" and found a bunch of good references that seem relevant. Searching the GitHub repository, it looks likely that this is the actual code: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/ptransform.py#L470
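For readers new to operator overloading, here is a minimal sketch of the language mechanism only. The class Double is a hypothetical toy and is not part of Apache Beam. When Python evaluates left | right, it first tries the left operand's __or__; if that is missing or returns NotImplemented, it falls back to the right operand's __ror__.

```python
class Double:
    """Hypothetical toy transform; NOT part of Apache Beam."""

    def __ror__(self, left):
        # Reached for `left | Double()` because the built-in list type
        # does not implement `|`, so Python tries the reflected method
        # on the right-hand operand.
        return [x * 2 for x in left]


result = [1, 2, 3] | Double()
print(result)  # [2, 4, 6]
```

So the | symbol carries no built-in "pipe" meaning; the classes on either side decide what it does.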

Answer 2

It is done through operator overloading:

def __or__(self, right):
  """Used to compose PTransforms, e.g., ptransform1 | ptransform2."""
  if isinstance(right, PTransform):
    return _ChainedPTransform(self, right)
  return NotImplemented

The pipe (|) is used to compose PTransforms, e.g. ptransform1 | ptransform2.
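To make the idea concrete, here is a self-contained toy sketch of the same pattern. All names (ToyTransform, ToyChained, ToyCollection) are hypothetical and only mimic the shape of the snippet above; this is not Beam's actual implementation. One simplification to keep in mind: the toy applies transforms eagerly to a Python list, whereas a real Beam pipeline only builds a graph of transforms that a runner executes later.

```python
class ToyTransform:
    """Hypothetical stand-in for a PTransform: wraps a per-element function."""

    def __init__(self, fn):
        self.fn = fn

    def expand(self, elements):
        # Apply the wrapped function to every element.
        return [self.fn(x) for x in elements]

    def __or__(self, right):
        # transform | transform -> a composite transform; nothing runs yet.
        if isinstance(right, ToyTransform):
            return ToyChained(self, right)
        return NotImplemented


class ToyChained(ToyTransform):
    """Hypothetical composite that applies its parts from left to right."""

    def __init__(self, *parts):
        self.parts = parts

    def expand(self, elements):
        for part in self.parts:
            elements = part.expand(elements)
        return elements

    def __or__(self, right):
        # Extend the chain instead of nesting composites.
        if isinstance(right, ToyTransform):
            return ToyChained(*self.parts, right)
        return NotImplemented


class ToyCollection:
    """Hypothetical stand-in for a PCollection (eager, for illustration)."""

    def __init__(self, elements):
        self.elements = list(elements)

    def __or__(self, transform):
        # collection | transform -> a new collection.
        if isinstance(transform, ToyTransform):
            return ToyCollection(transform.expand(self.elements))
        return NotImplemented


add_one = ToyTransform(lambda x: x + 1)
double = ToyTransform(lambda x: x * 2)

chained = add_one | double                    # builds a composite only
out = ToyCollection([1, 2, 3]) | chained      # applies it to the data
step_by_step = ToyCollection([1, 2, 3]) | add_one | double

print(out.elements)           # [4, 6, 8]
print(step_by_step.elements)  # [4, 6, 8]
```

Because | is left-associative, collection | a | b is evaluated as (collection | a) | b, which is why the data appears to flow from left to right. The operator itself never touches individual elements; it only decides which transform is attached to which collection.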

google-cloud-platform

apache-beam