'Missing close double quote (") character' error when loading a CSV file containing line feeds into BigQuery
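The culprit is the row below. It should consist of 14 columns, one of which starts with "Hi I'm Nigerian..." and spans multiple lines because it contains line feeds.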
17935,9a7105ee-30c8-4a6d-9374-10875b7d6288.jpg,"""top""=>""0"", ""left""=>""0"", ""width""=>""180"", ""height""=>""180""",,"",2015-07-26 19:33:57.292058,2015-07-26 20:25:30.068887,fe43876f-1b2c-464a-aa20-bf335ed3ff62,c68c8c70-bc2b-11e4-90a1-22000b21105f,{},2e790350-15fb-0133-2cb8-22000ba51078,"Hi I'm Nigerian so wish to study in sweden.
so I'm Undergraduate student I want study Engineering.
Thanks.","",{}
When I load this CSV data into BigQuery with the command bq load --replace --source_format=CSV -F"," ...
the errors below are reported. Can anyone suggest a fix for this BigQuery load command?
- File: 0 / Line:17192 / Field:12: Missing close double quote (")
character: field starts with: <Hi I'm N>
- File: 0 / Line:17193: Too few columns: expected 14 column(s) but
got 1 column(s). For additional help: http://goo.gl/RWuPQ
- File: 0 / Line:17194: Too few columns: expected 14 column(s) but
got 3 column(s). For additional help: http://goo.gl/RWuPQ
If you are loading a CSV file with embedded newlines, you need to specify allowQuotedNewlines (the --allow_quoted_newlines flag of bq load).
https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load.allowQuotedNewlines
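By default, BigQuery assumes that CSV data does not contain newlines inside fields. This allows higher parsing throughput on large data files, because an input file can be split at arbitrary newline characters and the pieces parsed in parallel. If your data contains newlines inside quoted strings, each file has to be parsed linearly by a single machine.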
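To make the splitting problem concrete, here is a minimal sketch that uses Python's standard csv module on the row from the question. It is only an illustration of the idea, not BigQuery's actual parser; the helper names columns_per_physical_line and columns_per_record are made up for this example.

```python
import csv
import io

# The row from the question: the 12th field is quoted and contains two line feeds.
RAW = (
    '17935,9a7105ee-30c8-4a6d-9374-10875b7d6288.jpg,'
    '"""top""=>""0"", ""left""=>""0"", ""width""=>""180"", ""height""=>""180""",'
    ',"",2015-07-26 19:33:57.292058,2015-07-26 20:25:30.068887,'
    'fe43876f-1b2c-464a-aa20-bf335ed3ff62,c68c8c70-bc2b-11e4-90a1-22000b21105f,'
    '{},2e790350-15fb-0133-2cb8-22000ba51078,'
    '"Hi I\'m Nigerian so wish to study in sweden.\n'
    "so I'm Undergraduate student I want study Engineering.\n"
    'Thanks.","",{}\n'
)


def columns_per_physical_line(text):
    """Parse each physical line on its own, as a splitter unaware of quotes would."""
    return [len(next(csv.reader([line]))) for line in text.splitlines()]


def columns_per_record(text):
    """Parse the whole text with a quote-aware reader that keeps quoted line feeds."""
    return [len(row) for row in csv.reader(io.StringIO(text, newline=""))]


# Splitting at every line feed breaks the record into three fragments; the
# second and third fragments have 1 and 3 columns, as in the error log.
print(columns_per_physical_line(RAW))

# A quote-aware parse sees a single record with all 14 columns.
print(columns_per_record(RAW))  # [14]
```

The second and third fragments correspond to the "expected 14 column(s) but got 1 column(s)" and "got 3 column(s)" messages reported for lines 17193 and 17194.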
Make sure to include this line before loading the data into BigQuery: 'job_config.allow_quoted_newlines = True'
from google.cloud import bigquery

job_config = bigquery.LoadJobConfig()
job_config.allow_quoted_newlines = True  # allow line feeds inside quoted CSV fields
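For context, a fuller sketch of how that setting might fit into a complete load job with the google-cloud-bigquery client follows. The function name, csv_path and table_id are hypothetical, and the sketch assumes the destination table already exists with the 14-column schema, the library is installed, and credentials are configured. WRITE_TRUNCATE is used here to mirror the --replace flag from the question.

```python
def load_csv_with_quoted_newlines(csv_path, table_id):
    """Load a local CSV file whose quoted fields may contain line feeds.

    csv_path and table_id ("project.dataset.table") are placeholders.
    """
    # Imported here so the sketch can be read and imported without the
    # google-cloud-bigquery package; calling the function requires it.
    from google.cloud import bigquery

    client = bigquery.Client()

    job_config = bigquery.LoadJobConfig()
    job_config.source_format = bigquery.SourceFormat.CSV
    job_config.allow_quoted_newlines = True
    # Overwrite the existing table contents, like `bq load --replace`.
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

    with open(csv_path, "rb") as source_file:
        load_job = client.load_table_from_file(
            source_file, table_id, job_config=job_config
        )

    load_job.result()  # wait for the job to finish; raises if it failed
    return load_job.output_rows
```

A call would look like load_csv_with_quoted_newlines("data.csv", "my-project.my_dataset.my_table"), where both arguments are made-up values.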