Stop executing a pipeline transform while other pipeline transforms keep running

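I have many files in Google Cloud Storage. After applying a simple ParDo transform, I have to write them to multiple BigQuery tables, and I am trying to do this with a single pipeline. So basically I have many parallel, unconnected sources and sinks running in a single pipeline within one Dataflow job. Inside the ParDo transform I have a condition: if it evaluates to true, writing to that particular BigQuery table (that transform) must stop, while writing to the other BigQuery tables (the other transforms) continues as usual.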

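In this image there are 2 parallel sources and 2 parallel sinks. The first transform failed because of some bad data for the date 2014-08-01. As soon as the 2014-08-01 transform failed, the 2014-08-02 transform was cancelled. The 2014-08-02 transform had no bad data.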

Is there a way to prevent the other transform from being cancelled?

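Currently, in the Dataflow service, an entire pipeline either succeeds or fails, and any failure cancels the rest of the pipeline. There is no way to change this behavior; if you want them to succeed or fail separately, you need to run separate pipelines.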

Note that, operationally, you can run two pipelines from the same Java main program; just create two different Pipeline objects and call run() on each.
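To illustrate the isolation this buys you, here is a minimal sketch in plain Java with no Dataflow dependency. The `DateJob` interface, the `runAll` helper, and the class name `IsolatedRuns` are hypothetical stand-ins: in a real program, the body of each job would create a fresh Pipeline object for one date, attach the read, ParDo and BigQuery write steps, and call run(). The sketch also assumes that run() blocks and throws when the pipeline fails, which depends on the runner you use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IsolatedRuns {

    /** Hypothetical stand-in for "build one Pipeline for this date and call run() on it". */
    public interface DateJob {
        void run(String date) throws Exception;
    }

    /**
     * Runs the job once per date. Each run is wrapped in its own try/catch,
     * so a failure for one date is recorded and the loop moves on to the next.
     */
    public static Map<String, String> runAll(String[] dates, DateJob job) {
        Map<String, String> outcome = new LinkedHashMap<>();
        for (String date : dates) {
            try {
                // In a real program, this is where a separate Pipeline object
                // would be created, wired up for `date`, and run.
                job.run(date);
                outcome.put(date, "SUCCEEDED");
            } catch (Exception e) {
                outcome.put(date, "FAILED: " + e.getMessage());
            }
        }
        return outcome;
    }

    public static void main(String[] args) {
        String[] dates = {"2014-08-01", "2014-08-02"};
        // Simulated job: the first date has bad data, the second is clean.
        DateJob job = date -> {
            if (date.equals("2014-08-01")) {
                throw new IllegalStateException("bad data");
            }
        };
        runAll(dates, job).forEach((d, s) -> System.out.println(d + " -> " + s));
    }
}
```

With the simulated job above, the 2014-08-01 run is reported as failed and the 2014-08-02 run still completes, which is the behavior the question asks for. If run() returns before the job finishes, you would have to check each job's final state afterwards instead of relying on the catch block.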

google-bigquery

google-cloud-platform

google-cloud-dataflow
