Python multiprocessing is slow

Question

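I have some code that calls a function in parallel. Inside the function, I check whether a file exists; if it doesn't, I create it, otherwise I do nothing.

I find that when the files already exist, calling multiprocessing.Process incurs a sizable time penalty compared with a simple for loop. Is this expected, or is there something I can do to reduce the penalty?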

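import multiprocessing
import os


def fn():
    # Check if file exists, if yes then return else make the file.
    # `fl` (the target file path) is defined elsewhere and not shown here.
    if not os.path.isfile(fl):
        # processing takes enough time to make the parallelization worth it
        pass  # placeholder: the file-creating work goes here
    else:
        print('file exists')


pkg_num = 0
total_runs = 2500
threads = []

# Keep at most 3 worker processes alive until all runs have been started
while pkg_num < total_runs or len(threads):
    if len(threads) < 3 and pkg_num < total_runs:
        t = multiprocessing.Process(target=fn, args=[])
        pkg_num = pkg_num + 1
        t.start()
        threads.append(t)
    else:
        # Drop finished processes; rebuild the list instead of removing
        # items from it while iterating over it
        threads = [thread for thread in threads if thread.is_alive()]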

Answer 1

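Starting a process carries considerable overhead. You have to weigh the cost of creating those processes against the performance you gain by running the tasks concurrently. I'm not sure a simple OS call benefits enough to be worth it.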
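
To get a feel for that overhead, here is a minimal sketch, not the asker's code. It assumes Python 3, and check_file, time_direct and time_with_processes are hypothetical helper names. It times the cheap "file already exists" branch once as plain calls and once with a fresh process per call. The actual numbers depend on the machine and OS, so none are quoted here.

```python
import multiprocessing
import os
import time


def check_file(path):
    """Stand-in for the cheap branch of fn(): only test whether the file exists."""
    return os.path.isfile(path)


def time_direct(path, runs):
    """Time `runs` plain function calls in the current process."""
    start = time.perf_counter()
    for _ in range(runs):
        check_file(path)
    return time.perf_counter() - start


def time_with_processes(path, runs):
    """Time `runs` calls, each made in its own freshly started process."""
    start = time.perf_counter()
    for _ in range(runs):
        p = multiprocessing.Process(target=check_file, args=(path,))
        p.start()
        p.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    path = __file__  # any file that is known to exist
    runs = 50
    print("direct calls:     %.4f s" % time_direct(path, runs))
    print("one process each: %.4f s" % time_with_processes(path, runs))
```

If the second figure dominates, the time is going into process startup rather than into the file check itself.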

Also, for the sake of posterity, you really should take a look at concurrent.futures.ProcessPoolExecutor; it is way, way cleaner. If you are on 2.7, you can use a backport of it.
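
A rough sketch of what that could look like, assuming Python 3. The names build_file, missing_files and build_missing are hypothetical, and build_file is only a stand-in for the real work. It also does the cheap existence check in the parent process, so that no process is involved for files that are already there; that part is my own suggestion rather than something ProcessPoolExecutor requires.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def build_file(path):
    """Hypothetical stand-in for the expensive work that creates the file."""
    with open(path, "w") as handle:
        handle.write("generated\n")
    return path


def missing_files(paths):
    """Cheap existence check, done in the parent without starting any process."""
    return [p for p in paths if not os.path.isfile(p)]


def build_missing(paths, max_workers=3):
    """Create only the missing files, reusing a fixed pool of worker processes."""
    todo = missing_files(paths)
    if not todo:
        return []  # nothing to do, so no pool is started at all
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(build_file, todo))


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical file names, for illustration only.
        targets = [os.path.join(workdir, "pkg_%d.out" % i) for i in range(10)]
        print(len(build_missing(targets)))  # first pass creates the files
        print(len(build_missing(targets)))  # second pass finds nothing missing
```

The pool starts its worker processes once and reuses them for every task, instead of paying the startup cost 2500 times, and it replaces the hand-written loop that polls is_alive().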

python

parallel-processing

multiprocessing