IOError: No files found based on the file pattern
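I am trying to run the example found in the Python SDK, but it fails with the stack trace shown below. Note: the first pipeline does create the "./names" file, but the second pipeline does not seem to be able to read from it.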
    No handlers could be found for logger "oauth2client.contrib.multistore_file"
    Traceback (most recent call last):
      File "example.py", line 17, in <module>
        | 'save' >> beam.io.WriteToText(greetings_file))
      File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/textio.py", line 391, in __init__
        skip_header_lines=skip_header_lines)
      File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/textio.py", line 88, in __init__
        validate=validate)
      File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filebasedsource.py", line 97, in __init__
        self._validate()
      File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filebasedsource.py", line 173, in _validate
        'No files found based on the file pattern %s' % self._pattern)
    IOError: No files found based on the file pattern ./names
The example code is as follows:
    import apache_beam as beam

    def add_greeting(name, messages):
        for msg in messages:
            yield '%s %s' % (msg, name)

    names_file = './names'
    greetings_file = './greetings'

    p = beam.Pipeline('DirectRunner')
    (p | 'add names' >> beam.Create(['Ann', 'Joe'])
       | 'save' >> beam.io.WriteToText(names_file))
    p.run()
    (p
     | 'load names' >> beam.io.ReadFromText(names_file)
     | 'add greetings' >> beam.FlatMap(add_greeting, ['Hello', 'Hola'])
     | 'save' >> beam.io.WriteToText(greetings_file))
    p.run()
Environment: I am running this in Google Cloud Shell, where I installed it:

    $ pip list --local --format=columns | grep dataflow
    google-cloud-dataflow 0.6.0
When a pipeline runs, the runners in Beam do not wait for it to finish, so you should add a call to wait_until_finish() after the call to p.run().
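Also, Beam pipelines use deferred execution: when you define new steps for a pipeline, they are added to its graph, and the whole graph is executed every time you run the pipeline. In short, if you want pipelines that run different steps, you need to create a new Pipeline object for each one.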
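To make the deferred-execution point concrete, here is a toy model in plain Python. It is not how Beam is implemented: ToyPipeline, add_step, and the step names are hypothetical, made up only for illustration. It mimics the behavior described above, where steps accumulate on the object and every call to run() executes all of them.

```python
class ToyPipeline(object):
    """Hypothetical stand-in for a pipeline graph; not part of Beam."""

    def __init__(self):
        self.steps = []  # the "graph": steps are recorded, not executed

    def add_step(self, name):
        self.steps.append(name)
        return self

    def run(self):
        # Every recorded step runs, including steps added before an earlier run().
        return list(self.steps)


p = ToyPipeline()
p.add_step('write names')
first_run = p.run()

p.add_step('read names')
second_run = p.run()       # 'write names' is executed again here

fresh = ToyPipeline()      # a new object starts with an empty graph
fresh.add_step('read names')
fresh_run = fresh.run()    # only 'read names' is executed
```

Reusing the same object for the second set of steps drags the first set along with it, which is why a fresh object is needed when the steps should run separately.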
This should work:
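    p = beam.Pipeline('DirectRunner')
    (p | 'add names' >> beam.Create(['Ann', 'Joe'])
       | 'save' >> beam.io.WriteToText('./names'))
    # Block until the first pipeline has finished writing its output.
    p.run().wait_until_finish()

    # A new Pipeline object, so the steps above are not part of this graph.
    p = beam.Pipeline('DirectRunner')
    (p
     # The glob matches the sharded output files written by WriteToText.
     | 'load names' >> beam.io.ReadFromText('./names*')
     | 'add greetings' >> beam.FlatMap(add_greeting, ['Hello', 'Hola'])
     | 'save' >> beam.io.WriteToText(greetings_file))
    p.run().wait_until_finish()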
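The './names*' glob in the read step matters because WriteToText writes sharded files rather than one file with the exact name given. The sketch below uses only the standard library and does not run Beam. It assumes the default shard naming gives a file such as names-00000-of-00001 (the actual name on your machine may differ) and creates that file by hand to show which pattern finds it.

```python
import glob
import os
import tempfile

workdir = tempfile.mkdtemp()

# Assumed shard name, created by hand to stand in for WriteToText output.
shard = os.path.join(workdir, 'names-00000-of-00001')
with open(shard, 'w') as f:
    f.write('Ann\nJoe\n')

# The exact name matches nothing, because no file is literally called "names".
exact_matches = glob.glob(os.path.join(workdir, 'names'))

# The glob pattern picks up the shard.
glob_matches = glob.glob(os.path.join(workdir, 'names*'))

print(exact_matches)
print(glob_matches)
```

If you prefer a single pattern that cannot accidentally match unrelated files, a tighter glob such as './names-*' is an option under the same naming assumption.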