dask distributed, fail to start worker
In some situations the dask cluster seems to hang on restart.
To simulate this I wrote the following silly code:
import contextlib2
from distributed import Client, LocalCluster

for i in xrange(100):
    print i
    with contextlib2.ExitStack() as es:
        # start a fresh local cluster (multiprocess workers) on every iteration
        cluster = LocalCluster(processes=True, n_workers=4)
        client = Client(cluster)
        # ExitStack callbacks run in LIFO order on exit:
        # the cluster is closed first, then the client
        es.callback(client.close)
        es.callback(cluster.close)
This code never completes the loop; instead I get this error:
raise_exc_info(self._exc_info)
File "//anaconda/lib/python2.7/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "//anaconda/lib/python2.7/site-packages/distributed/deploy/local.py", line 191, in _start
yield [self._start_worker(**self.worker_kwargs) for i in range(n_workers)]
File "//anaconda/lib/python2.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "//anaconda/lib/python2.7/site-packages/tornado/concurrent.py", line 269, in result
raise_exc_info(self._exc_info)
File "//anaconda/lib/python2.7/site-packages/tornado/gen.py", line 883, in callback
result_list.append(f.result())
File "//anaconda/lib/python2.7/site-packages/tornado/concurrent.py", line 269, in result
raise_exc_info(self._exc_info)
File "//anaconda/lib/python2.7/site-packages/tornado/gen.py", line 1147, in run
yielded = self.gen.send(value)
File "//anaconda/lib/python2.7/site-packages/distributed/deploy/local.py", line 217, in _start_worker
raise gen.TimeoutError("Worker failed to start")
I am running dask distributed 1.25.1 with Python 2.7 on a Mac.
This is an issue in Dask: with Python 2.7 on Linux, the only way to start new workers (as separate processes) is to use fork, and fork in turn can create deadlocks.
For details, see the open dask ticket:
https://github.com/dask/distributed/issues/2446
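To make that failure mode concrete, here is a minimal, self-contained sketch of the classic fork-plus-threads deadlock (hypothetical demo code, not taken from dask): a lock held by a thread in the parent is copied into the child in its locked state, and no thread exists in the child to ever release it. The child hangs by design, so the parent kills it after a timeout to let the demo terminate.

import os
import signal
import threading
import time

lock = threading.Lock()

def hold_lock():
    lock.acquire()
    time.sleep(2)          # hold the lock while the parent forks
    lock.release()

t = threading.Thread(target=hold_lock)
t.start()
time.sleep(0.5)            # make sure the worker thread owns the lock

pid = os.fork()
if pid == 0:
    # Child: fork copied the lock in its *locked* state, but the thread
    # that would release it was not copied, so this blocks forever.
    print "child: waiting for the inherited lock..."
    lock.acquire()
    print "child: never reached"
    os._exit(0)
else:
    t.join()               # releases only the parent's copy of the lock
    time.sleep(3)
    os.kill(pid, signal.SIGKILL)   # the child is still stuck; clean it up
    os.waitpid(pid, 0)
    print "parent: child deadlocked on the lock it inherited across fork()"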
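A possible workaround sketch, assuming the loop does not actually need a brand-new cluster on every iteration: create the LocalCluster once and reuse it, so the worker processes are forked only once at startup. This only sidesteps the repeated-fork path; it is not a fix for the underlying ticket.

import contextlib2
from distributed import Client, LocalCluster

with contextlib2.ExitStack() as es:
    # fork the worker processes exactly once, up front
    cluster = LocalCluster(processes=True, n_workers=4)
    es.callback(cluster.close)     # runs last (LIFO)
    client = Client(cluster)
    es.callback(client.close)      # runs first (LIFO)
    for i in xrange(100):
        print i
        # ... submit work through `client` here ...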