How to load multiple AVRO files using a bq load command

I am trying to load multiple AVRO files into BigQuery by following this documentation:

https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro

According to the documentation, the command to do this is:

bq --location=US load --source_format=AVRO [DATASET].[TABLE_NAME] "gs://mybucket/00/*.avro","gs://mybucket/01/*.avro"

I wrote a script that finds the files and builds the command like this:

bq load --source_format=AVRO --noreplace foo.bar3456  "gs://mybucket/foo/36.avro", "gs://mybucket/foo_bar/01.avro", "gs://mybucket/bar/211.avro"

But this only works when I pass a single file, like this:

bq load --source_format=AVRO --noreplace foo.bar3456 "gs://mybucket/foo/36.avro"

When I try the command with multiple files, I get this error:

Too many positional args, still have ["gs://mybucket/foo_bar/01.avro"]

This is the script I use to build the command:

def create_command_bq_load(buckets):
    # datasetname is assumed to be defined at module level.
    commands = []
    for bucket in buckets:
        command = 'bq load --source_format=AVRO --noreplace %s.%s_%s$%s' % (datasetname, bucket['product'], bucket['event'], bucket['data_partition'])
        if bucket['files']:
            command_file = ''
            for file_uri in bucket['files']:
                # Appends ' "<uri>",' so consecutive URIs end up separated by ', '.
                command_file = '%s "%s",' % (command_file, file_uri)
            # Drop the trailing comma.
            commands.append((command + ' ' + command_file)[:-1])
    return commands

Any help?

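Solved. My mistake was the space character (' ') between the files: the shell splits the list at each space, so bq receives every URI after the first as an extra positional argument. The correct form is: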

bq load --source_format=AVRO --noreplace foo.bar3456 "gs://mybucket/foo/36.avro","gs://mybucket/foo_bar/01.avro","gs://mybucket/bar/211.avro"
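For the script side, a minimal sketch of one way to avoid the problem is to join the URIs with a bare comma and keep the arguments in a list rather than gluing a string together. The function name build_bq_load_args and its parameters are made up for illustration; only the bq flags already shown in the question are used, and shlex.join needs Python 3.8 or newer.

```python
import shlex


def build_bq_load_args(table, uris, source_format="AVRO"):
    """Return an argument list for `bq load` (hypothetical helper, not part of bq)."""
    if not uris:
        raise ValueError("at least one source URI is required")
    # All sources go into ONE argument, separated by commas with no spaces.
    sources = ",".join(uri.strip() for uri in uris)
    return ["bq", "load", "--source_format=%s" % source_format,
            "--noreplace", table, sources]


args = build_bq_load_args(
    "foo.bar3456",
    ["gs://mybucket/foo/36.avro",
     "gs://mybucket/foo_bar/01.avro",
     "gs://mybucket/bar/211.avro"],
)
# shlex.join quotes arguments where the shell would need it.
print(shlex.join(args))
```

The list in args can also be passed directly to subprocess.run(args), which skips the shell entirely, so a table name containing a `$` partition decorator would not need extra quoting.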