COPY INTO Snowflake Table with Extra Columns

I have a table in Snowflake defined as:

GLPCT

BATCH_KEY NUMBER(38,0) NULL
CTACCT VARCHAR(100) NULL
CTPAGE NUMBER(38,0) NULL

and a file that looks like this:

GLPCT.csv

CTACCT VARCHAR(100)
CTPAGE NUMBER(38,0)

Example:

CTACCT,CTPAGE
"Test Account",100
"Second Account", 200

My COPY INTO command looks like this:

copy into GLPCT_POC from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

Question

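Snowflake throws an error because the number of columns does not match. How can I get Snowflake to ignore the column that is not present in the file instead of throwing an error? If it helps, I can move BATCH_KEY to the end of the table.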

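You can add a "transformation" as you copy the data in, by loading from a query. In this case, your transformation can be to add a NULL column.

However, to use this feature you need to create a stage for your external source: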

create or replace stage my_stage 
url='azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

copy into GLPCT_POC
from (SELECT NULL, $1, $2 FROM @my_stage);

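$1 and $2 line up with the columns in the file, and the order of the columns in the SELECT clause lines up with the columns of the table.

An added benefit is that if you reuse that COPY statement and/or the stage, you do not need to repeat all the credentials and file format information.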
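As a variation on the statement above, the target columns can be named explicitly so the load does not depend on where BATCH_KEY sits in the table. This is only a sketch: it assumes the stage my_stage from above already exists with its file format attached, that GLPCT_POC has the same columns as the GLPCT definition in the question, and the alias t is purely illustrative.

```sql
-- Optional: preview the first two fields of the staged file before loading
-- (assumes my_stage was created with the CSV file format shown above).
select t.$1, t.$2
from @my_stage t
limit 10;

-- Load only the columns that exist in the file. BATCH_KEY is not listed,
-- so it should be left NULL (it is nullable and has no default).
copy into GLPCT_POC (CTACCT, CTPAGE)
from (select t.$1, t.$2 from @my_stage t);
```

Naming the columns keeps the statement valid if the table's column order changes later, at the cost of having to update the list when the file layout changes.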

Data load with transformation syntax

It appears that you can specify which columns to insert into with a COPY INTO statement, so ours becomes:

copy into GLPCT_POC (CTACCT, CTPAGE) from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

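Because this is an external file, we were not able to use the transformation mentioned in the earlier answer.

Snowflake allows you to set ERROR_ON_COLUMN_COUNT_MISMATCH in the file format.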

ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE Boolean that specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input data file does not match the number of columns in the corresponding table.

If set to FALSE, an error is not generated and the load continues. If the file is successfully loaded:

If the input file contains records with more fields than columns in the table, the matching fields are loaded in order of occurrence in the file and the remaining fields are not loaded.

If the input file contains records with fewer fields than columns in the table, the non-matching columns in the table are loaded with NULL values.

https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#type-csv
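
Applied to the command from the question, the option might be set as shown below. This is a sketch rather than a verified fix. The quoted documentation suggests that fields are matched to table columns in order, so with BATCH_KEY as the first column the text in CTACCT would presumably land in a NUMBER column and fail to convert. The example therefore assumes BATCH_KEY has been moved to the end of the table, as the question offers.

```sql
-- Assumes GLPCT_POC columns are ordered CTACCT, CTPAGE, BATCH_KEY,
-- so the two file fields line up with the first two table columns.
copy into GLPCT_POC
from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(
  TYPE=CSV,
  SKIP_HEADER = 1,
  FIELD_OPTIONALLY_ENCLOSED_BY='"',
  ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE  -- do not fail when the file has fewer fields than the table
);
```

Note that turning this check off applies to every file the command loads, so a genuinely malformed file with too few or too many fields would also load without an error.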