compute() in dask not working

I am trying to do a simple parallel computation in Dask. Here is my code.

  import time
  import dask as dask
  import dask.distributed as distributed
  import dask.dataframe as dd
  import dask.delayed as delayed
  from dask.distributed import Client,progress

  client = Client('localhost:8786')
  df = dd.read_csv('file.csv')
  ddf = df.groupby(['col1'])[['col2']].sum() 
  ddf = ddf.compute()
  print ddf

According to the documentation this looks fine, but when I run it I get this:

    Traceback (most recent call last):
      File "dask_prg1.py", line 17, in <module>
        ddf = ddf.compute()
      File "/usr/local/lib/python2.7/site-packages/dask/base.py", line 156, in compute
        (result,) = compute(self, traverse=False, **kwargs)
      File "/usr/local/lib/python2.7/site-packages/dask/base.py", line 402, in compute
        results = schedule(dsk, keys, **kwargs)
      File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 2159, in get
        direct=direct)
      File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 1562, in gather
        asynchronous=asynchronous)
      File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 652, in sync
        return sync(self.loop, func, *args, **kwargs)
      File "/usr/local/lib/python2.7/site-packages/distributed/utils.py", line 275, in sync
        six.reraise(*error[0])
      File "/usr/local/lib/python2.7/site-packages/distributed/utils.py", line 260, in f
        result[0] = yield make_coro()
      File "/usr/local/lib/python2.7/site-packages/tornado/gen.py", line 1099, in run
        value = future.result()
      File "/usr/local/lib/python2.7/site-packages/tornado/concurrent.py", line 260, in result
        raise_exc_info(self._exc_info)
      File "/usr/local/lib/python2.7/site-packages/tornado/gen.py", line 1107, in run
        yielded = self.gen.throw(*exc_info)
      File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 1439, in _gather
        traceback)
      File "/usr/local/lib/python2.7/site-packages/dask/bytes/core.py", line 122, in read_block_from_file
        with lazy_file as f:
      File "/usr/local/lib/python2.7/site-packages/dask/bytes/core.py", line 166, in __enter__
        f = SeekableFile(self.fs.open(self.path, mode=mode))
      File "/usr/local/lib/python2.7/site-packages/dask/bytes/local.py", line 58, in open
        return open(self._normalize_path(path), mode=mode)
    IOError: [Errno 2] No such file or directory: 'file.csv'

I can't figure out what is wrong. Kindly help me solve this. Thanks in advance.

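You may want to pass the absolute file path to read_csv. The reason is that you are handing the work of opening and reading the file to a dask worker, and you might not have started that worker with the same working directory as your script/session.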
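A minimal sketch of that suggestion, assuming the workers run on the same machine as the script (or share its filesystem) so that the absolute path exists for them too. The helper names `absolute_csv_path` and `grouped_sum` are made up for illustration; the scheduler address, file name and column names are taken from the question.

```python
import os


def absolute_csv_path(path):
    """Resolve `path` against the working directory of the client script."""
    return os.path.abspath(path)


def grouped_sum(csv_path, scheduler_address="localhost:8786"):
    """Run the groupby-sum from the question, but with an absolute CSV path."""
    # Imported here so the path helper above can be used without dask installed.
    import dask.dataframe as dd
    from dask.distributed import Client

    # Creating the Client makes the distributed scheduler the default for compute().
    client = Client(scheduler_address)
    # The worker now receives a full path, so its own working directory no longer matters.
    df = dd.read_csv(absolute_csv_path(csv_path))
    return df.groupby(["col1"])[["col2"]].sum().compute()


# Usage, with a scheduler and worker already running:
#   print(grouped_sum("file.csv"))
```

If the workers run on other machines, an absolute path only helps when the file is present at that same path on every worker, for example on a shared network filesystem.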