Run a function in parallel with different arguments - python

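I have a function slow_function that takes about 200 seconds to process one job_title. It reads from and writes to global variables.

Using the code below gives no performance improvement. Am I missing something? It returns the same result as the sequential version.

Code that runs the five job categories in parallel: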

    import time
    from threading import Thread

    threads = []

    start = time.time()
    # Create one thread per job title
    for job_title in self.job_titles:
        t = Thread(target=self.slow_function, args=(job_title,))
        threads.append(t)

    # Start all threads
    for x in threads:
        x.start()

    # Wait for all of them to finish
    for x in threads:
        x.join()
    end = time.time()
    print("New time taken for all jobs: %s" % (end - start))

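You need to use the multiprocessing module (https://docs.python.org/2/library/multiprocessing.html), since the threading module is limited by the GIL (https://docs.python.org/2/glossary.html#term-global-interpreter-lock): only one thread executes Python bytecode at a time, so if slow_function is CPU-bound, threads will not make it any faster.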
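
To see the effect of the GIL for yourself, here is a minimal timing sketch. It assumes the work is CPU-bound; `cpu_bound` is a hypothetical stand-in for slow_function, and `run_with_threads` / `run_with_processes` are names made up for this example. No timings are quoted because they depend on the machine, but on a standard CPython build the threaded run is not expected to beat a plain loop.

```python
import time
from multiprocessing import Pool
from threading import Thread


def cpu_bound(n):
    """Hypothetical stand-in for slow_function: pure CPU work, no I/O."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_with_threads(sizes):
    """Run cpu_bound once per size in separate threads; return elapsed seconds."""
    threads = [Thread(target=cpu_bound, args=(n,)) for n in sizes]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


def run_with_processes(sizes):
    """Run cpu_bound once per size in a process pool; return elapsed seconds."""
    start = time.time()  # include pool start-up in the measurement
    pool = Pool()
    try:
        pool.map(cpu_bound, sizes)
    finally:
        pool.close()
        pool.join()
    return time.time() - start


if __name__ == "__main__":
    sizes = [2000000] * 5  # five jobs, mirroring the five job categories
    print("threads:   %.2f s" % run_with_threads(sizes))
    print("processes: %.2f s" % run_with_processes(sizes))
```

The `if __name__ == "__main__":` guard matters here: on platforms where worker processes re-import the main module, creating a Pool at import time would fail.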

But you cannot use global variables to exchange data between the spawned processes! Each process works on its own copy of them. See https://docs.python.org/2/library/multiprocessing.html#exchanging-objects-between-processes
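
A small sketch of what that means in practice. The names `counter`, `worker_mutates_global` and `worker_uses_queue` are invented for this illustration; the only library pieces used are `multiprocessing.Process` and `multiprocessing.Queue`. The first worker changes a module-level dict, but only inside the child process. The second sends its result back through a queue, which is one of the exchange mechanisms the linked documentation describes.

```python
from multiprocessing import Process, Queue

# Module-level ("global") state, as in the question
counter = {"done": 0}


def worker_mutates_global(job_title):
    """Changes only the copy of `counter` owned by the process that runs it."""
    counter["done"] += 1


def worker_uses_queue(job_title, queue):
    """Sends the result back to the parent instead of writing a global."""
    queue.put((job_title, len(job_title)))


if __name__ == "__main__":
    p = Process(target=worker_mutates_global, args=("engineer",))
    p.start()
    p.join()
    # The child incremented its own copy; the parent's copy is unchanged
    print("counter in parent: %d" % counter["done"])

    queue = Queue()
    p = Process(target=worker_uses_queue, args=("engineer", queue))
    p.start()
    result = queue.get()  # read before join() so the child can flush its data
    p.join()
    print("received from child: %r" % (result,))
```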

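You should move slow_function out of the class and make it a module-level function, because the local context (self) cannot be shared between processes. Then you can use this code: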

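    import time
    from multiprocessing import Pool

    start = time.time()

    # By default the pool starts one worker process per CPU core
    pool = Pool()
    # Blocks until every job is done; results keep the order of job_titles
    results = pool.map(slow_function, self.job_titles)
    pool.close()
    pool.join()

    for r in results:
        # update your `global` variables here
        pass

    end = time.time()
    print("New time taken for all jobs: %s" % (end - start))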
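
For completeness, here is one way the pieces could fit together as a runnable script. This is only a sketch under assumptions: the body of `slow_function` is a placeholder (a short sleep instead of the real 200 seconds of work), `JobRunner` and its `merge` / `run_all` methods are hypothetical names, and the result of each job is assumed to be something that can be pickled and returned.

```python
import time
from multiprocessing import Pool


def slow_function(job_title):
    """Hypothetical module-level worker: returns its result instead of writing globals."""
    time.sleep(0.1)  # placeholder for the real work
    return job_title, len(job_title)


class JobRunner(object):
    """Hypothetical owner of job_titles; collects results in the parent process."""

    def __init__(self, job_titles):
        self.job_titles = job_titles
        self.results = {}

    def merge(self, pairs):
        """Store (job_title, value) pairs; runs in the parent, so plain state is fine."""
        for job_title, value in pairs:
            self.results[job_title] = value

    def run_all(self):
        """Process every job title in a worker pool, then merge the results."""
        pool = Pool()
        try:
            pairs = pool.map(slow_function, self.job_titles)
        finally:
            pool.close()
            pool.join()
        self.merge(pairs)
        return self.results


if __name__ == "__main__":
    runner = JobRunner(["engineer", "analyst", "designer", "manager", "tester"])
    start = time.time()
    runner.run_all()
    print("New time taken for all jobs: %.2f s" % (time.time() - start))
    print(runner.results)
```

The worker only returns values, and all shared state is updated in the parent after `pool.map` finishes, so no locking or shared memory is needed.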