BigQuery insert retry policy in Apache Beam

The Apache Beam API provides the following insert retry policy for BigQuery:

https://beam.apache.org/releases/javadoc/2.1.0/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.html

Background

 jsonPayload: {
  exception:  "java.lang.RuntimeException: java.io.IOException: Insert failed:
 [{"errors":[{"debugInfo":"","location":"","message":"Value 690000000 for field
 timestamp_scanned of the destination table fr-prd-datalake:rfid_raw.store_epc_transactions_cr_uqjp is outside the allowed bounds.
You can only stream to date range within 365 days in the past and 183 days in
the future relative to the current date.","reason":"invalid"}],

How does a Dataflow job behave if I specify retryTransientErrors?

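All errors are treated as transient, and therefore retried, unless BigQuery reports the error reason as "invalid", "invalidQuery", or "notImplemented". The error in the log above has the reason "invalid", so that insert would not be retried under this policy.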
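The classification rule can be sketched in plain Java with no Beam dependency. This is only an illustration of the rule as described here, not Beam's actual implementation: the class name `TransientErrorRuleSketch`, the method signature, and the sample reason `backendError` are assumptions made for the example; only the three persistent reasons come from the answer.

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; this is NOT Beam's InsertRetryPolicy class.
class TransientErrorRuleSketch {
    // Reasons treated as persistent (non-retryable), as listed in the answer.
    static final Set<String> PERSISTENT_REASONS =
        Set.of("invalid", "invalidQuery", "notImplemented");

    // Returns true when none of the reported reasons is persistent,
    // i.e. when the failed insert would be retried.
    static boolean shouldRetry(List<String> reasons) {
        for (String reason : reasons) {
            if (reason != null && PERSISTENT_REASONS.contains(reason)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "invalid" is the reason reported in the log from the Background section.
        System.out.println(shouldRetry(List.of("invalid")));      // prints false
        // "backendError" is used here as an assumed example of a transient reason.
        System.out.println(shouldRetry(List.of("backendError"))); // prints true
    }
}
```

Under this rule a single persistent reason among a row's errors is enough to stop retries for that row, which is why the out-of-range timestamp in the log fails permanently instead of being retried.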

shouldRetry receives the error from BigQuery, so I can decide whether to retry. Where can I find the errors I should expect from BigQuery?

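You can't, because the errors are not visible to the caller. I'm not sure whether this is intentional, or whether Apache Beam should expose the errors so that users can write their own retry logic.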