Weird behavior when running code with Pool.map() and range() in Python 3

Question

I'm seeing some strange behavior when using multiprocessing together with range(), and I can't figure out what is going on.

The code is as follows:

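from multiprocessing import Pool
import time

def worker_thread(param):
    # Simulate one second of work, then print the item that was handled.
    time.sleep(1)
    print(param, end=' ', flush=True)

if __name__ == '__main__':
    # The guard keeps child processes from re-running the pool setup
    # on platforms that start workers by importing this module.
    p = Pool(1)
    inp = list(range(0, 100))

    p.map(worker_thread, inp)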

When I run this code (with only 1 worker process), the output is as expected:

0 1 2 3 4 5 6 7 ...

However, when I increase the number of worker processes to, say, 2, the output becomes hard to explain:

0 13 1 14 2 15 3 16 4 17 ...

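And so on; the same behavior shows up with higher worker counts. Since list(range(0,100)) produces a list of the numbers 0 to 99 in ascending order, why doesn't map() walk the list in the order it is in?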

Answer 1

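You're seeing the items printed in an unexpected order because multiprocessing.Pool.map splits the input into chunks, and each chunk is handled by a single worker process. This is documented: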

map(func, iterable[, chunksize])

A parallel equivalent of the map() built-in function (it supports only one iterable argument though). It blocks until the result is ready.

This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
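To make the chunking concrete, here is a minimal sketch in plain Python. The helper names default_chunksize and split_into_chunks are hypothetical, and the heuristic (about four chunks per worker, rounded up) is an assumption based on what CPython's implementation appears to do when chunksize is omitted. It is an implementation detail, not a documented guarantee.

```python
def default_chunksize(n_items, n_workers):
    # Assumed heuristic: aim for roughly four chunks per worker, rounding up.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def split_into_chunks(items, chunksize):
    # Cut the list into consecutive slices of at most `chunksize` items.
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


if __name__ == "__main__":
    items = list(range(100))
    size = default_chunksize(len(items), 2)
    chunks = split_into_chunks(items, size)
    print(size)                           # 13
    print(chunks[0][:3], chunks[1][:3])   # [0, 1, 2] [13, 14, 15]
```

With two workers, one picks up the chunk starting at 0 while the other picks up the chunk starting at 13, which would explain printed output that interleaves 0 13 1 14 2 15.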

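In the example output you show, it looks like Python picked a chunksize of 13, since you didn't specify one yourself. Try passing 1 as the chunksize, and I think you'll get the output you expect (though possibly at the cost of reduced performance).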
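A hedged sketch of that suggestion follows. The worker function, the run helper, and the shortened sleep are illustrative assumptions, not part of the original code. With chunksize=1 the items are handed out one at a time, so the printed order should be close to ascending, although the exact interleaving between two processes is still not guaranteed. The list returned by map() is in input order regardless of chunksize.

```python
from multiprocessing import Pool
import time


def worker(param):
    # Simulate a little work, print the item, and return a value.
    time.sleep(0.1)
    print(param, end=' ', flush=True)
    return param * 2


def run(n_workers, chunksize):
    # Hypothetical helper: map over 0..9 with an explicit chunksize.
    with Pool(n_workers) as pool:
        return pool.map(worker, range(10), chunksize=chunksize)


if __name__ == '__main__':
    results = run(n_workers=2, chunksize=1)
    print()
    # The results come back in input order, whatever order the prints appeared in.
    print(results)
```

If the order of side effects such as printing matters, it is usually more robust to return values from the worker and print the ordered result list in the parent process.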

python

python-3.x

python-multiprocessing
