Google Dataflow Python quickstart error: 'NoneType' object has no attribute 'GcsIO'
I have been following the Dataflow Python Quickstart and hit an error when running the wordcount example pipeline:
...
File "apache_beam/io/fileio.py", line 281, in glob
return gcsio.GcsIO().glob(path, limit)
AttributeError: 'NoneType' object has no attribute 'GcsIO'
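I tried it with my own pipeline and got the same result. I'm not sure what the problem is, since I believe I followed the tutorial exactly, and the error seems to be related to the read/write transform. The full traceback: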
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/Alex/beam/sdks/python/apache_beam/examples/wordcount.py", line 116, in <module>
    run()
  File "/Users/Alex/beam/sdks/python/apache_beam/examples/wordcount.py", line 87, in run
    lines = p | 'read' >> ReadFromText(known_args.input)
  File "apache_beam/io/textio.py", line 378, in __init__
    skip_header_lines=skip_header_lines)
  File "apache_beam/io/textio.py", line 87, in __init__
    validate=validate)
  File "apache_beam/io/filebasedsource.py", line 97, in __init__
    self._validate()
  File "apache_beam/io/filebasedsource.py", line 171, in _validate
    if len(fileio.ChannelFactory.glob(self._pattern, limit=1)) <= 0:
  File "apache_beam/io/fileio.py", line 281, in glob
    return gcsio.GcsIO().glob(path, limit)
AttributeError: 'NoneType' object has no attribute 'GcsIO'
Any idea what I'm doing wrong?
Thanks
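This happens because the google-apitools package is not installed (the code mentions this, but it should be better documented).
Try running pip install google-apitools in your virtualenv, then rerun the pipeline (note that you need Google Cloud credentials on your system).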
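For readers wondering why a missing package shows up as an AttributeError rather than an ImportError, here is a minimal sketch of the general optional-import pattern that produces this kind of message. It is not Beam's actual source; the helper names and the module name are hypothetical and chosen only so that the import fails.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Hypothetical module name (an assumption for illustration): it does not
# exist, so the placeholder ends up being None instead of a module.
gcsio = optional_import("hypothetical_missing_gcs_dependency")


def glob(path):
    # Attribute access on the None placeholder raises AttributeError,
    # which hides the fact that the real problem was a failed import.
    return gcsio.GcsIO().glob(path)


def describe_failure(path):
    """Call glob() and return the error message, if any."""
    try:
        glob(path)
    except AttributeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(describe_failure("gs://some-bucket/*"))
```

If the placeholder is None, the failure only appears at the first call that touches it, which is consistent with the error surfacing inside glob during source validation.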
Installing google-apitools alone did not solve my problem. I had to install the SDK directly from source, including its gcp dependencies, which are defined in requires.txt in the SDK's egg-info:
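# run this in your virtualenv, from the root of the beam checkout
SDK_PATH=sdks/python
# quote the argument so the shell does not treat [gcp] as a glob pattern
pip install -e "${SDK_PATH}[gcp]"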
After logging in with gcloud auth application-default login, I could run the wordcount example successfully.
Edit: rewrote the answer because the previous solution did not work as expected.
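Before rerunning the pipeline, it may help to confirm that the dependencies are visible in the active virtualenv. The following is a rough sketch, not part of the SDK: missing_modules is a hypothetical helper, and the assumption that google-apitools is imported under the top-level name apitools should be checked against your installed version.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: "apache_beam" for the SDK itself and
    # "apitools" for the google-apitools package.
    missing = missing_modules(["apache_beam", "apitools"])
    if missing:
        print("Not importable in this environment: " + ", ".join(missing))
    else:
        print("All checked modules are importable.")
```

Run it with the same interpreter that runs the pipeline; a module reported as missing there suggests the install went into a different environment.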