Python multiprocessing Pool Interaction With Namespace At Creation
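We know that multiprocessing.Pool must be created after the functions it will run have been defined. Even so, I find the behavior of the following code hard to understand: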
import os
from multiprocessing import Pool
def func(i): print('first')
pool1 = Pool(2)
pool1.map(func, range(2)) #map-1
def func(i): print('second')
func2 = func
print('------')
pool1.map(func, range(2)) #map-2
pool1.map(func2, range(2)) #map-3
pool2 = Pool(2)
print('------')
pool2.map(func, range(2)) #map-4
pool2.map(func2, range(2)) #map-5
The output (Python 2.7 and Python 3.4 on Linux) is:
first #map-1
first
------
first #map-2
first
first #map-3
first
------
second #map-4
second
second #map-5
second
map-2 prints 'first', as expected. But how does map-3 find the name func2? pool1 was initialized before func2 first appears. So func2 = func has evidently taken effect, while def func(i): print('second') has not. Why?
If I instead define func2 directly with
def func2(i): print('second')
then map-3 will not find the name func2, as many posts mention, e.g. this one. What is the difference between the two cases?
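As I understand it, the arguments are passed to the child processes by pickling, but how does the pool pass the called function to the other processes? Or, how do the subprocesses find the called function?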
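tl;dr: at map-3 the first func is called where one would expect the second, because Pool.map() uses pickle, and pickle serializes a function by its name, func.__name__. Even though the function is bound to the reference func2, its name still resolves to func. That name is sent to the child process, which looks up func in its own local namespace.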
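The by-name behaviour can be seen without any pool at all. The sketch below is only an illustration: it uses json.dumps from the standard library as a stand-in for func, and a hypothetical alias other_name in the role of func2; neither appears in the original post.

```python
import json
import pickle

# `other_name` is a second reference to json.dumps, like func2 = func.
other_name = json.dumps

payload_original = pickle.dumps(json.dumps)
payload_alias = pickle.dumps(other_name)

# Both references give the same bytes: only module and function name are stored.
same_bytes = payload_original == payload_alias
mentions_real_name = b"dumps" in payload_alias
mentions_alias = b"other_name" in payload_alias

# Unpickling looks the name up again and returns whatever it finds there.
restored = pickle.loads(payload_alias)

print(same_bytes, mentions_real_name, mentions_alias, restored is json.dumps)
```

The name of the variable used to reach the function never makes it into the payload, which is why the worker has nothing called func2 to look for.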
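OK, so I can count four distinct questions, listed below. I assume you have already read up on namespaces and forking processes, so let's get straight to the fun part of your question ☺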
① But how does map-3 find the name func2?
② So func2 = func is indeed executed, while def func(i): print('second') is not. Why?
③ Then map-3 won't find name func2 as mentioned by many posts, eg. this one. What's the difference between two cases?
④ As I understand the arguments are passed to the slave processes by pickling, but how does pool pass the called function to other processes? Or how do sub-processes find the called function?
So I added a bit more code to show more of the internals:
import os
from multiprocessing import Pool

print(os.getpid(), 'parent')

def func(i):
    print(os.getpid(), 'first', end=" | ")
    if 'func' in globals():
        print(globals()['func'], end=" | ")
    else:
        print("no func in globals", end=" | ")
    if 'func2' in globals():
        print(globals()['func2'])
    else:
        print("no func2 in globals")

print('------ map-1')
pool1 = Pool(2)
pool1.map(func, range(2)) #map-1

def func(i):
    print(os.getpid(), 'second', end=" | ")
    if 'func' in globals():
        print(globals()['func'], end=" | ")
    else:
        print("no func in globals", end=" | ")
    if 'func2' in globals():
        print(globals()['func2'])
    else:
        print("no func2 in globals")

func2 = func
print('------ map-2')
pool1.map(func, range(2)) #map-2
print('------ map-3')
pool1.map(func2, range(2)) #map-3
pool2 = Pool(2)
print('------ map-4')
pool2.map(func, range(2)) #map-4
print('------ map-5')
pool2.map(func2, range(2)) #map-5
Output on my system:
21512 parent
------ map-1
21513 first | <function func at 0x7f62d67f7cf8> | no func2 in globals
21514 first | <function func at 0x7f62d67f7cf8> | no func2 in globals
------ map-2
21513 first | <function func at 0x7f62d67f7cf8> | no func2 in globals
21514 first | <function func at 0x7f62d67f7cf8> | no func2 in globals
------ map-3
21513 first | <function func at 0x7f62d67f7cf8> | no func2 in globals
21514 first | <function func at 0x7f62d67f7cf8> | no func2 in globals
------ map-4
21518 second | <function func at 0x7f62d531bed8> | <function func at 0x7f62d531bed8>
21519 second | <function func at 0x7f62d531bed8> | <function func at 0x7f62d531bed8>
------ map-5
21518 second | <function func at 0x7f62d531bed8> | <function func at 0x7f62d531bed8>
21519 second | <function func at 0x7f62d531bed8> | <function func at 0x7f62d531bed8>
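So we can see that, for pool1, func2 is never added to the workers' namespace. Something fishy is definitely going on there, and it is too late at night for me to dig through the multiprocessing sources with a debugger to find out what.
So, if I had to guess an answer to ①, the pickle module somehow finds out that func2 resolves to 0x7f62d531bed8, which already exists under the label func, so it pickles the known "label" func, which on the children's side resolves to 0x7f62d67f7cf8. That is: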
func2 → 0x7f62d531bed8 → func → [PICKLE] → globals()['func'] → 0x7f62d67f7cf8
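The chain above can be mimicked with a toy model that involves no real processes. Everything in it is an assumption made for illustration: build_namespace, send and receive are hypothetical helpers, and two module objects stand in for the parent and for a pool1 worker forked before func was redefined.

```python
import types


def build_namespace(name, source):
    """Run `source` in a fresh module object and return it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


FIRST = "def func(i):\n    return 'first'\n"
SECOND = "def func(i):\n    return 'second'\n"

# pool1's workers were forked when only the first func existed.
worker_of_pool1 = build_namespace("toy_main", FIRST)
# The parent kept running: func was redefined and func2 was bound to it.
parent = build_namespace("toy_main", FIRST + SECOND + "func2 = func\n")


def send(function):
    """Parent side: only the function's own name goes on the wire."""
    return function.__name__


def receive(name, namespace):
    """Worker side: look the name up in the worker's own namespace."""
    return getattr(namespace, name)


wire = send(parent.func2)
result = receive(wire, worker_of_pool1)(0)
print(wire, result)
```

In this model the parent's func2 is the second function, yet the worker ends up calling its own first func, which matches the map-3 lines in the output above.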
To test my theory, I changed your code a bit, renaming the second func() to func2(), and here is what I get:
------ map-3
Process PoolWorker-1:
Process PoolWorker-2:
Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
return recv()
AttributeError: 'module' object has no attribute 'func2'
AttributeError: 'module' object has no attribute 'func2'
Then, also changing func2 = func to func = func2, the failure already shows up at map-2:
------ map-2
Process PoolWorker-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Process PoolWorker-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
return recv()
AttributeError: 'module' object has no attribute 'func2'
AttributeError: 'module' object has no attribute 'func2'
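So I believe I am starting to make my point. It also shows where to read the code on the children's side to understand what is happening, which gives more clues for answering ② and ③!
To go further, I added a print statement in pool.py at line 114: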
job, i, func, args, kwds = task
print("XXX", os.getpid(), job, i, func, args, kwds)
to show what is going on. We can see that func is resolved to 0x7f2d0238fcf8, the same address as in the parent process:
23432 parent
------ map-1
('XXX', 23433, 0, 0, <function mapstar at 0x7f2d02363230>, ((<function func at 0x7f2d0238fcf8>, (0,)),), {})
23433 first | <function func at 0x7f2d0238fcf8> | no func2 in globals
('XXX', 23434, 0, 1, <function mapstar at 0x7f2d02363230>, ((<function func at 0x7f2d0238fcf8>, (1,)),), {})
23434 first | <function func at 0x7f2d0238fcf8> | no func2 in globals
------ map-2
('XXX', 23433, 1, 0, <function mapstar at 0x7f2d02363230>, ((<function func at 0x7f2d0238fcf8>, (0,)),), {})
23433 first | <function func at 0x7f2d0238fcf8> | no func2 in globals
('XXX', 23434, 1, 1, <function mapstar at 0x7f2d02363230>, ((<function func at 0x7f2d0238fcf8>, (1,)),), {})
23434 first | <function func at 0x7f2d0238fcf8> | no func2 in globals
------ map-3
('XXX', 23433, 2, 0, <function mapstar at 0x7f2d02363230>, ((<function func at 0x7f2d0238fcf8>, (0,)),), {})
23433 first | <function func at 0x7f2d0238fcf8> | no func2 in globals
('XXX', 23434, 2, 1, <function mapstar at 0x7f2d02363230>, ((<function func at 0x7f2d0238fcf8>, (1,)),), {})
23434 first | <function func at 0x7f2d0238fcf8> | no func2 in globals
------ map-4
('XXX', 23438, 3, 0, <function mapstar at 0x7f2d02363230>, ((<function func at 0x1092e60>, (0,)),), {})
23438 second | <function func at 0x1092e60> | <function func at 0x1092e60>
('XXX', 23439, 3, 1, <function mapstar at 0x7f2d02363230>, ((<function func at 0x1092e60>, (1,)),), {})
23439 second | <function func at 0x1092e60> | <function func at 0x1092e60>
------ map-5
('XXX', 23438, 4, 0, <function mapstar at 0x7f2d02363230>, ((<function func at 0x1092e60>, (0,)),), {})
('XXX', 23439, 4, 1, <function mapstar at 0x7f2d02363230>, ((<function func at 0x1092e60>, (1,)),), {})
23438 second | <function func at 0x1092e60> | <function func at 0x1092e60>
23439 second | <function func at 0x1092e60> | <function func at 0x1092e60>
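The printed tuples can be read as (job, i, func, args, kwds), with mapstar as the function and the user's function tucked inside args. The following is a rough sketch of how a worker could unpack one such task. It is written from the printed output rather than copied from pool.py, so treat the body of mapstar as an assumption; run_task and square are hypothetical names.

```python
def mapstar(args):
    # args is (function, chunk); apply the function to every item of the chunk.
    return list(map(*args))


def run_task(task):
    """Unpack one task tuple and call its function with its arguments."""
    job, i, func, args, kwds = task
    return job, i, func(*args, **kwds)


def square(x):  # hypothetical stand-in for the user's func
    return x * x


# Same shape as the printed tasks: (job, i, mapstar, ((func, (item,)),), {})
task = (0, 1, mapstar, ((square, (3,)),), {})
outcome = run_task(task)
print(outcome)
```

By the time the tuple is printed in the worker, the user's function has already been unpickled there, so the address shown is the one the name resolved to in that worker.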
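So to answer ④, we would need to dig further into the multiprocessing sources, and maybe even the pickle sources.
But I think my hunch about the resolution is probably right... The only question left is why the label is resolved to an address and then back to a label before it is pushed to the children processes!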
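Edit: I think I know why! The reason came to me as I was heading to bed, so I came back to my keyboard:
When pickling a function, pickle takes the argument containing the function and gets the name from the function object itself.
So even though you did create a new function object, with a different address in memory: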
>>> print(func)
<function func at 0x7fc6174e3ed8>
pickle does not care, because if the function is not already accessible to the child, the child will never be able to access it. So pickle only resolves func.__name__:
>>> print("func.__name__:", func.__name__)
func.__name__: func
>>> print("func2.__name__:", func2.__name__)
func2.__name__: func
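So even if you have changed the body of the function in the parent process and made a new reference to it, what really gets pickled is the function's internal name, which is fixed when the function is defined (a lambda is always named '<lambda>', whatever it is assigned to).
This explains why passing func2 to pool1 at the map-3 stage gets you the old func function.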
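A short check of that point, runnable in any interpreter. The names first and square are introduced here only for illustration.

```python
def func(i):
    return "first"

first = func            # keep a handle on the first function object

def func(i):            # a new object; the name func is rebound to it
    return "second"

func2 = func            # a second reference to the new object

square = lambda x: x * x  # assignment does not rename a lambda

names = (first.__name__, func.__name__, func2.__name__, square.__name__)
print(names)
print(first is func, func2 is func)
```

Both function objects carry the name func even though they are different objects, and func2 never shows up as a __name__ at all.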
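So, in conclusion: for ①, map-3 does not find the name func2; it finds the name func inside the function that func2 references. That also answers ② and ③, because the func that is found runs the original func function. And the mechanism, func.__name__ being used to pickle and resolve the function name between the two processes, answers ④.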
Last update, from you:
In pickle._Pickler.save_global, the name is obtained with
if name is None: name = getattr(obj, '__qualname__', None)
and then
if name is None: name = obj.__name__
So __name__ is only used if obj has no __qualname__.
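The lookup order in those two quoted lines can be tried out in isolation. This is a minimal sketch, not pickle's actual code path: reference_name is a hypothetical helper that copies only the two lines quoted above, and legacy is a made-up object that carries nothing but __name__.

```python
import types


def reference_name(obj):
    """Mirror the two quoted lines: prefer __qualname__, fall back to __name__."""
    name = getattr(obj, "__qualname__", None)
    if name is None:
        name = obj.__name__
    return name


def inner(i):
    return i

inner.__qualname__ = "renamed"   # __name__ is still "inner"

# Hypothetical object that has a __name__ but no __qualname__.
legacy = types.SimpleNamespace(__name__="legacy")

print(reference_name(inner), inner.__name__, reference_name(legacy))
```

On Python 3 every function has a __qualname__, so for functions the fallback to __name__ only matters on objects that lack it.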
However, it will also check whether the object passed is the same as the one in the subprocess:
if obj2 is not obj: raise PicklingError(...)
where obj2, parent = _getattribute(module, name).
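Yes, but remember that what gets passed is only the function's (internal) name, not the function itself. The child process has no way to tell whether its func() is the same as the parent's func() in memory.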
Edit from @SyrtisMajor:
OK, so let's change the first code above:
import os
from multiprocessing import Pool

print(os.getpid(), 'parent')

def func(i):
    print(os.getpid(), 'first', end=" | ")
    if 'func' in globals():
        print(globals()['func'], end=" | ")
    else:
        print("no func in globals", end=" | ")
    if 'func2' in globals():
        print(globals()['func2'])
    else:
        print("no func2 in globals")

print('------ map-1')
pool1 = Pool(2)
pool1.map(func, range(2)) #map-1

def func2(i):
    print(os.getpid(), 'second', end=" | ")
    if 'func' in globals():
        print(globals()['func'], end=" | ")
    else:
        print("no func in globals", end=" | ")
    if 'func2' in globals():
        print(globals()['func2'])
    else:
        print("no func2 in globals")

func2.__qualname__ = func.__qualname__
func = func2
print('------ map-2')
pool1.map(func, range(2)) #map-2
print('------ map-3')
pool1.map(func2, range(2)) #map-3
pool2 = Pool(2)
print('------ map-4')
pool2.map(func, range(2)) #map-4
print('------ map-5')
pool2.map(func2, range(2)) #map-5
The output is:
38130 parent
------ map-1
38131 first | <function func at 0x101856f28> | no func2 in globals
38132 first | <function func at 0x101856f28> | no func2 in globals
------ map-2
38131 first | <function func at 0x101856f28> | no func2 in globals
38132 first | <function func at 0x101856f28> | no func2 in globals
------ map-3
38131 first | <function func at 0x101856f28> | no func2 in globals
38132 first | <function func at 0x101856f28> | no func2 in globals
------ map-4
38133 second | <function func at 0x10339b510> | <function func at 0x10339b510>
38134 second | <function func at 0x10339b510> | <function func at 0x10339b510>
------ map-5
38133 second | <function func at 0x10339b510> | <function func at 0x10339b510>
38134 second | <function func at 0x10339b510> | <function func at 0x10339b510>
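This is exactly the same as our first output. Note that the func = func2 after the definition of func2 is crucial, because pickle checks whether func2 (whose name is now func) is the same object as __main__.func. If it is not, pickling fails.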
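That identity check can be reproduced on the parent side alone, without a pool. In this sketch a throwaway module named demo_main_module (a hypothetical name) stands in for __main__, and the two functions return strings instead of printing.

```python
import pickle
import sys
import types

# Hypothetical throwaway module standing in for __main__.
demo = types.ModuleType("demo_main_module")
sys.modules[demo.__name__] = demo
exec(
    "def func(i):\n    return 'first'\n"
    "def func2(i):\n    return 'second'\n",
    demo.__dict__,
)

# Same trick as in the edit above: make func2 claim to be called func.
demo.func2.__qualname__ = demo.func.__qualname__

try:
    pickle.dumps(demo.func2)
    failed_before_rebinding = False
except pickle.PicklingError:
    # demo_main_module.func is still the first function, not this object.
    failed_before_rebinding = True

demo.func = demo.func2            # the crucial func = func2 step
payload = pickle.dumps(demo.func2)

print(failed_before_rebinding, b"func2" in payload, pickle.loads(payload)(0))
```

Once the module attribute func points at the renamed object, pickling succeeds and the payload refers to func only. A pool1 worker forked earlier would still resolve that name to its own, older func.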