NPE in dataflow pipeline at SourceOperationExecutor.isSplitOperationTooLargeForDataflowService

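My Dataflow pipeline ran fine up to and including its last run. Today, when I ran it on a new dataset, I started getting a NullPointerException. The odd part is that the exception does not seem to come from my code: none of my classes appear anywhere in the stack trace shown below.

Is this a bug in the Dataflow framework? Or, since the exception is thrown in isSplitOperationTooLargeForDataflowService, is this dataset (more precisely, its splits) too large for Dataflow?

Any help or insight would be greatly appreciated!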

2016-07-04T16:27:00.044Z: Error:   (fb0b4effcb8800a6):    
java.lang.NullPointerException
at com.google.cloud.dataflow.sdk.runners.worker.SourceOperationExecutor.isSplitOperationTooLargeForDataflowService(SourceOperationExecutor.java:100)
at com.google.cloud.dataflow.sdk.runners.worker.SourceOperationExecutor.isSplitResponseTooLarge(SourceOperationExecutor.java:92)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:227)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:146)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:164)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:145)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:132)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

This is a bug that was fixed in version 1.4.0 of the Dataflow SDK. At the time of writing, the latest version of the SDK is 1.6.0.

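If the Eclipse plugin reports "up to date" while you are on version 1.2.1, it sounds like the problem lies with the Eclipse plugin itself. If you manually update your pom.xml to use version 1.6.0 of the SDK, your problem should be resolved.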
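
As a rough sketch of that manual update, the dependency entry in pom.xml might look like the following. The groupId and artifactId shown are the usual Maven coordinates for the Dataflow Java SDK 1.x, but they are an assumption here, since the question does not show the project's pom.xml; keep whatever coordinates your file already uses and change only the version.

```xml
<!-- Assumed coordinates for the Dataflow Java SDK 1.x; check them against your existing pom.xml. -->
<dependency>
  <groupId>com.google.cloud.dataflow</groupId>
  <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
  <!-- Was 1.2.1; the NPE fix landed in 1.4.0, and 1.6.0 is the latest at the time of writing. -->
  <version>1.6.0</version>
</dependency>
```

After saving the change, refresh or re-import the Maven project in Eclipse so the new SDK jar is actually picked up before you rerun the pipeline.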