What's the difference between ThreadPool and Pool in the multiprocessing module?

Question

What is the difference between ThreadPool and Pool in the multiprocessing module? When I try my code out, this is the main difference I see:

from multiprocessing import Pool
import os, time

print("hi outside of main()")

def hello(x):
    print("inside hello()")
    print("Proccess id: ", os.getpid())
    time.sleep(3)
    return x*x

if __name__ == "__main__":
    p = Pool(5)
    pool_output = p.map(hello, range(3))

    print(pool_output)

I see the following output:

hi outside of main()
hi outside of main()
hi outside of main()
hi outside of main()
hi outside of main()
hi outside of main()
inside hello()
Proccess id:  13268
inside hello()
Proccess id:  11104
inside hello()
Proccess id:  13064
[0, 1, 4]

Then with ThreadPool:

from multiprocessing.pool import ThreadPool
import os, time

print("hi outside of main()")

def hello(x):
    print("inside hello()")
    print("Proccess id: ", os.getpid())
    time.sleep(3)
    return x*x

if __name__ == "__main__":
    p = ThreadPool(5)
    pool_output = p.map(hello, range(3))

    print(pool_output)

I see the following output:

hi outside of main()
inside hello()
inside hello()
Proccess id:  15204
Proccess id:  15204
inside hello()
Proccess id:  15204
[0, 1, 4]

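My questions are:

Why is the code outside the if __name__ == "__main__" block run each time with Pool?
Does multiprocessing.pool.ThreadPool not spawn new processes? Does it just create new threads?
If so, what is the difference between using multiprocessing.pool.ThreadPool and just using the threading module?

I don't see any official documentation for ThreadPool anywhere. Can someone help me find it?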

Answer 1

multiprocessing.pool.ThreadPool behaves the same as multiprocessing.Pool; the only difference is that it uses threads instead of processes to run the worker logic.
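A minimal sketch of what that means in practice: every task submitted to a ThreadPool runs inside the calling process, so os.getpid() never changes, while the thread id may. The helper names where_am_i and collect_worker_ids are made up for this illustration.

```python
import os
import threading
from multiprocessing.pool import ThreadPool


def where_am_i(_):
    # Report the process id and thread id of the worker running this task.
    return os.getpid(), threading.get_ident()


def collect_worker_ids(n_tasks=8, n_workers=4):
    # ThreadPool exposes the same map() interface as multiprocessing.Pool.
    with ThreadPool(n_workers) as pool:
        return pool.map(where_am_i, range(n_tasks))


if __name__ == "__main__":
    ids = collect_worker_ids()
    print("distinct process ids:", {pid for pid, _ in ids})
    print("distinct thread ids:", len({tid for _, tid in ids}))
```

Swapping ThreadPool for multiprocessing.Pool in the same function should instead report worker process ids that differ from the parent's.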

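The reason you see

hi outside of main()

printed multiple times with multiprocessing.Pool is that the pool spawns 5 independent processes. Each process initializes its own Python interpreter and loads the module, which causes the top-level print to execute again.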

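Note that this happens only when the spawn process creation method is used (the only method available on Windows). If you use the fork method (Unix), you will see the message printed only once, just as with the threads.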
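To compare the two behaviours on one machine, the start method can be chosen explicitly through a context object. This is only a sketch: map_with_start_method and available_start_methods are hypothetical helper names, while get_context and get_all_start_methods are part of the multiprocessing module.

```python
import multiprocessing as mp


def square(x):
    return x * x


def available_start_methods():
    # "spawn" is available on every platform; "fork" and "forkserver" are Unix only.
    return mp.get_all_start_methods()


def map_with_start_method(method, values, workers=2):
    # get_context() returns a context bound to one start method
    # without changing the global default.
    ctx = mp.get_context(method)
    with ctx.Pool(workers) as pool:
        return pool.map(square, values)
```

Calling map_with_start_method("spawn", range(3)) from under an if __name__ == "__main__" guard in a script that has a top-level print should repeat that print once per worker; with "fork", where it is available, the print should appear only once.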

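multiprocessing.pool.ThreadPool is not documented because its implementation has never been completed. It lacks tests and documentation. You can see its implementation in the source code.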

I believe the next natural question is: when should you use a thread-based pool, and when a process-based one?

The rule of thumb is:

IO-bound jobs -> multiprocessing.pool.ThreadPool
CPU-bound jobs -> multiprocessing.Pool
Hybrid jobs -> depends on the workload; I usually prefer multiprocessing.Pool because of the advantages process isolation brings
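As a rough illustration of the IO-bound case, the sketch below uses time.sleep as a stand-in for a blocking network or disk call. fake_download and the other helper names are invented for the example, and the timings will vary by machine.

```python
import time
from multiprocessing.pool import ThreadPool


def fake_download(item):
    # Stand-in for blocking I/O; sleeping releases the GIL, so threads overlap.
    time.sleep(0.2)
    return item * 2


def run_serial(items):
    return [fake_download(i) for i in items]


def run_threaded(items, workers=4):
    with ThreadPool(workers) as pool:
        return pool.map(fake_download, items)


def timed(fn, *args):
    # Return the function result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    items = list(range(4))
    print("serial:  ", timed(run_serial, items))
    print("threaded:", timed(run_threaded, items))
```

For CPU-bound work the threaded version would generally not show the same gain, because only one thread executes Python bytecode at a time in the standard interpreter; that is where multiprocessing.Pool tends to pay off.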

On Python 3 you might want to take a look at the concurrent.futures.Executor pool implementations.
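A small sketch of the concurrent.futures equivalent, assuming a thread-based executor is wanted; squares_with_executor is a hypothetical helper name. ProcessPoolExecutor offers the same interface backed by processes.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def squares_with_executor(values, workers=4):
    # Executor.map keeps results in input order, like Pool.map,
    # but returns a lazy iterator, hence the list() call.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(squares_with_executor(range(3)))
```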
