通过 Python 脚本使用 Google BigQuery
using Google BigQuery through Python script
我想通过 python 脚本在 BigQuery 上完成一些非常简单的任务。我发现 this package 效果不佳。事实上,当我尝试这段代码时:
from bigquery import get_client
project_id = 'txxxxxxxxxxxxxxxxxx9'
# Service account email address as listed in the Google Developers Console.
service_account = '7xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com'
# PKCS12 or PEM key provided by Google.
key = '/home/fxxxxxxxxxxxx/Dropbox/access_keys/google_storage/xxxxxxxxxxxxxxxxxxxxx.pem'
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
# Submit an async query.
results = client.get_table_schema('newdataset', 'newtable2')
print('results')
我收到这个错误:
/home/xxxxxx/anaconda3/envs/snakes/bin/python2.7 /home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py
Traceback (most recent call last):
File "/home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py", line 9, in <module>
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 83, in get_client
readonly=readonly)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 101, in _get_bq_service
service = build('bigquery', 'v2', http=http)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/util.py", line 142, in positional_wrapper
return wrapped(*args, **kwargs)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 196, in build
cache)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 242, in _retrieve_discovery_doc
resp, content = http.request(actual_url)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 565, in new_request
self._refresh(request_orig)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 835, in _refresh
self._do_refresh_request(http_request)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 862, in _do_refresh_request
body = self._generate_refresh_request_body()
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1541, in _generate_refresh_request_body
assertion = self._generate_assertion()
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1670, in _generate_assertion
private_key, self.private_key_password), payload)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/_pycrypto_crypt.py", line 121, in from_string
pkey = RSA.importKey(parsed_pem_key)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 665, in importKey
return self._importKeyDER(der)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 588, in _importKeyDER
raise ValueError("RSA key format is not supported")
ValueError: RSA key format is not supported
Process finished with exit code 1
我的问题:python 中是否有教程显示如何轻松地与 BigQuery 通信:从 google 存储或 S3 导入数据集,查询内容,将结果导出到 google 存储.
很大程度上取决于您的环境,一旦您了解了这一点,一切都应该非常简单。我在您粘贴的错误日志中看到唯一的问题是确定身份验证。
Python pandas 支持 BigQuery 有一段时间了:
我和模块的创建者一起制作了一个视频:
现在,现在使用您提到的所有 Google 云产品启动 Jupyter notebook 的最简单、最快的方法是我们新的 Google Datalab 项目:
Datalab 唯一需要注意的是它可以在云服务器上运行,但如果您想要一个完全托管的 Jupyter/IPython 环境,完全安全、持久并准备好处理 BigQuery、存储等...试试吧出。
同时,如果您正在编写 Web 应用程序,请了解其他 Web 应用程序如何解决此任务。
例如,re:dash 连接到 BigQuery 的代码:
https://github.com/EverythingMe/redash/blob/master/redash/query_runner/big_query.py
我想通过 python 脚本在 BigQuery 上完成一些非常简单的任务。我发现 this package 效果不佳。事实上,当我尝试这段代码时:
from bigquery import get_client
project_id = 'txxxxxxxxxxxxxxxxxx9'
# Service account email address as listed in the Google Developers Console.
service_account = '7xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com'
# PKCS12 or PEM key provided by Google.
key = '/home/fxxxxxxxxxxxx/Dropbox/access_keys/google_storage/xxxxxxxxxxxxxxxxxxxxx.pem'
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
# Submit an async query.
results = client.get_table_schema('newdataset', 'newtable2')
print('results')
我收到这个错误:
/home/xxxxxx/anaconda3/envs/snakes/bin/python2.7 /home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py
Traceback (most recent call last):
File "/home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py", line 9, in <module>
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 83, in get_client
readonly=readonly)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 101, in _get_bq_service
service = build('bigquery', 'v2', http=http)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/util.py", line 142, in positional_wrapper
return wrapped(*args, **kwargs)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 196, in build
cache)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 242, in _retrieve_discovery_doc
resp, content = http.request(actual_url)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 565, in new_request
self._refresh(request_orig)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 835, in _refresh
self._do_refresh_request(http_request)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 862, in _do_refresh_request
body = self._generate_refresh_request_body()
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1541, in _generate_refresh_request_body
assertion = self._generate_assertion()
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1670, in _generate_assertion
private_key, self.private_key_password), payload)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/_pycrypto_crypt.py", line 121, in from_string
pkey = RSA.importKey(parsed_pem_key)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 665, in importKey
return self._importKeyDER(der)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 588, in _importKeyDER
raise ValueError("RSA key format is not supported")
ValueError: RSA key format is not supported
Process finished with exit code 1
我的问题:python 中是否有教程显示如何轻松地与 BigQuery 通信:从 google 存储或 S3 导入数据集,查询内容,将结果导出到 google 存储.
很大程度上取决于您的环境,一旦您了解了这一点,一切都应该非常简单。我在您粘贴的错误日志中看到唯一的问题是确定身份验证。
Python pandas 支持 BigQuery 有一段时间了:
我和模块的创建者一起制作了一个视频:
现在,现在使用您提到的所有 Google 云产品启动 Jupyter notebook 的最简单、最快的方法是我们新的 Google Datalab 项目:
Datalab 唯一需要注意的是它可以在云服务器上运行,但如果您想要一个完全托管的 Jupyter/IPython 环境,完全安全、持久并准备好处理 BigQuery、存储等...试试吧出。
同时,如果您正在编写 Web 应用程序,请了解其他 Web 应用程序如何解决此任务。
例如,re:dash 连接到 BigQuery 的代码:
https://github.com/EverythingMe/redash/blob/master/redash/query_runner/big_query.py