Cannot read from BigQuery

I am trying to read a simple BigQuery table.

The pipeline hangs at this log line:

WARNING:root:Dataset thijs-dev:temp_dataset_b234824381e04e1324234237724b485f95c does not exist so we will create it as temporary with location=EU

I run it with the following command:

python main.py \
  --runner DirectRunner \
  --project thijs-dev \
  --temp_location gs://thijs/tmp/ \
  --job_name thijs-dev-load \
  --save_main_session

And this is the full Python script:

import apache_beam as beam

import logging
import argparse


def run(argv=None):
    parser = argparse.ArgumentParser()
    known_args, pipeline_args = parser.parse_known_args(argv)


    with beam.Pipeline(argv=pipeline_args) as p:
        # Read all data from source_table
        source_data = (
            p
            | beam.io.Read(beam.io.BigQuerySource(
                query="select * from `thijs-dev.metathijs.thijs_locations`",
                use_standard_sql=True))
        )


if __name__ == '__main__':
    print("Start")
    logging.getLogger().setLevel(logging.INFO)
    run()

It turns out that Dataflow is just very slow. Processing 26 MB of data takes half an hour, but it does work.
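For anyone who wants to check that the pipeline is really progressing rather than hung, here is a minimal sketch that reads only a small sample and logs the row count. It is not the script from the question: `build_query` and `run_sample` are hypothetical helper names, the default limit of 100 is arbitrary, and it assumes a Beam release that ships `beam.io.ReadFromBigQuery` (the newer replacement for `BigQuerySource`). Note that `LIMIT` only shrinks the result Beam has to export and read; it is not guaranteed to make the query itself cheaper.

```python
import logging


def build_query(table, limit=None):
    """Return a standard-SQL SELECT for `table`, optionally capped at `limit` rows."""
    query = "SELECT * FROM `{}`".format(table)
    if limit is not None:
        query += " LIMIT {}".format(int(limit))
    return query


def run_sample(pipeline_args, table, limit=100):
    """Read a small sample from BigQuery and log how many rows came back."""
    # Imported here so build_query can be used without Beam installed.
    import apache_beam as beam

    with beam.Pipeline(argv=pipeline_args) as p:
        (
            p
            # Assumption: ReadFromBigQuery is available in the installed Beam version.
            | "ReadSample" >> beam.io.ReadFromBigQuery(
                query=build_query(table, limit), use_standard_sql=True)
            | "CountRows" >> beam.combiners.Count.Globally()
            | "LogCount" >> beam.Map(lambda n: logging.info("Read %d rows", n))
        )
```

Calling `run_sample(pipeline_args, "thijs-dev.metathijs.thijs_locations")` from `run()` with the same command-line flags as above should finish much sooner than the full read if the pipeline is healthy. If even the sample stalls at the temporary dataset warning, the problem is probably not just throughput.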