Parallel function from joblib runs the whole script, not just the function

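I am using the Parallel function from the joblib package in Python. I want to use it only to run one of my functions in parallel, but unfortunately the whole script gets executed in parallel (apart from the other functions).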

Example:

from joblib import Parallel, delayed

print('I do not want this to be printed n times')

def do_something(arg):
    pass  # some calculations on arg go here

n = 6  # number of tasks (not given in the question; value assumed)
Parallel(n_jobs=5)(delayed(do_something)(i) for i in range(0, n))

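This is a common mistake that comes from missing a design direction given in the documentation. Many users have run into the same thing.

The documentation is quite clear: place no code other than def-s outside a __main__ fuse, i.e. the if __name__ == '__main__': guard. On a system that does not support forking (Windows, for example), every worker process has to import your script again, so anything left at module level runs once per worker.

If you skip the fuse, errors do spew out and things go badly wrong, yet the explicit advice to re-read the documentation is still there, scrolling endlessly across the screen: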

[joblib] Attempting to do parallel computing
without protecting your import on a system that does not support forking.

To use parallel-computing in a script, you must protect your main loop
using "if __name__ == '__main__'".

Please see the joblib documentation on Parallel for more information
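
The effect of the guard can be reproduced without starting any worker at all. The sketch below is an illustration only, using nothing but the standard library: it writes a small hypothetical script (demo_script.py, not part of the question) to a temporary directory and executes it twice with runpy, once under the name __main__ and once under a different module name. The name __mp_main__ is an assumption taken from what recent CPython versions use for spawned multiprocessing workers; older versions use other names, but any name other than __main__ behaves the same way here.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Hypothetical script: one unguarded print, one print behind the __main__ fuse.
SCRIPT = """\
print('top-level: runs on every import of the script')

def do_something(arg):
    return arg * 2

if __name__ == '__main__':
    print('guarded: runs only in the launching process')
"""


def run_as(run_name):
    """Execute SCRIPT under the given module name and return the lines it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as handle:
            handle.write(SCRIPT)
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            runpy.run_path(path, run_name=run_name)
    return captured.getvalue().splitlines()


# What "python demo_script.py" does in the launching process.
launcher_output = run_as("__main__")
# Simulated re-import by a worker process (module name is an assumption, see above).
worker_output = run_as("__mp_main__")

print(launcher_output)
print(worker_output)
```

Only the unguarded print shows up in the simulated worker run, which is why an unguarded print in the real script is repeated once per worker.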

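Solution:

Once the first issue is properly fixed, i.e. the one reported about the missing import protection (the fuse) around the Parallel call, things get better. The errors are gone, although the print still sits at module level and is repeated: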

C:\Python27.anaconda>python joblib_example.py
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
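
A quick way to check which process emits each of these lines is to tag the output with the process id and process name. This is only a diagnostic sketch using the standard library; whoami is a hypothetical helper name, not something joblib provides.

```python
import multiprocessing
import os


def whoami():
    """Return a short label identifying the process that calls this function."""
    proc = multiprocessing.current_process()
    return "pid=%d name=%s" % (os.getpid(), proc.name)


# Placed at module level of the script, this line is printed by every
# process that imports the script, each time with its own pid.
print("module-level code executed by " + whoami())
```

If several different pids appear, the repeated lines come from worker processes re-importing the script, not from the function being called several times.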

Next comes a final polish, moving the print behind the fuse as well, and the job is done:

from joblib import Parallel, delayed   # sklearn.externals.joblib is gone from current scikit-learn

def do_some_thing(arg):
    # placeholder for the real work
    return True

if __name__ == '__main__':   # a __main__ FUSE: only the launching process runs this block
    n = 6
    print("I do not want this to be printed n-times...")

    Parallel(n_jobs=5)(delayed(do_some_thing)(i) for i in range(0, n))

C:\Python27.anaconda>python joblib_example.py
I do not want this to be printed n-times...

C:\Python27.anaconda>