Reset BigQuery Table from Spark ETL

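I have a question. I have an ETL built in Databricks that loads data into BigQuery, and I would like to wipe the BigQuery table before each run of the ETL. Is that possible? Sorry for the newbie question! Thanks!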

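When you load data, there are two properties (among many others) under the configuration.load property of a jobs.insert request that control what happens to the table you load into, and how: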

configuration.load.writeDisposition

[Optional] Specifies the action that occurs if the destination table already exists.

The following values are supported:
WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data.
WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.
The default value is WRITE_APPEND.

Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion.

configuration.load.createDisposition

[Optional] Specifies whether the job is allowed to create new tables.

The following values are supported:
CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table.
CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result.
The default value is CREATE_IF_NEEDED.

Creation, truncation and append actions occur as one atomic update upon job completion.

So WRITE_TRUNCATE is what you are looking for.
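To show where these two properties sit, here is a minimal sketch in Python that only builds the request body for jobs.insert. It does not call BigQuery. The helper name build_load_job_body, the project, dataset and table names, the gs:// path and the PARQUET source format are all placeholders assumed for illustration; they are not taken from the question.

```python
import json


def build_load_job_body(project_id, dataset_id, table_id, source_uris,
                        source_format="PARQUET"):
    """Build a jobs.insert request body for a load job (hypothetical helper).

    The dict mirrors the configuration.load section of the REST API.
    Nothing is sent to BigQuery here; the caller would still have to
    submit the body with whatever client the ETL already uses.
    """
    return {
        "configuration": {
            "load": {
                "sourceUris": list(source_uris),
                "sourceFormat": source_format,
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                # Overwrite the existing table data on every run.
                "writeDisposition": "WRITE_TRUNCATE",
                # Create the table on the first run if it is missing.
                "createDisposition": "CREATE_IF_NEEDED",
            }
        }
    }


if __name__ == "__main__":
    # All identifiers below are made-up placeholders.
    body = build_load_job_body(
        project_id="my-project",
        dataset_id="my_dataset",
        table_id="etl_output",
        source_uris=["gs://my-bucket/etl/output/*.parquet"],
    )
    print(json.dumps(body, indent=2))
```

Because the truncation is applied atomically when the job completes, a failed run should leave the previous table contents in place. If the ETL writes through the Spark BigQuery connector rather than the REST API, the overwrite save mode is, as far as I know, the usual way to request the same behaviour, but check the connector documentation for your version.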