Python: Why is a Manager dict() reported as "not defined" inside pool workers in multiprocessing?

My code looks like this:

from multiprocessing import Process, Manager,Pool
station=["A","B","C"]   
def test(k):
    try:
        print(phicc)
    except Exception as E:
        print(E)
        print(station[k])
        


if __name__ == '__main__':
    
      with Manager() as manager:
        phicc=manager.dict()
        for i in station:
           
           phicc[i]=manager.list()
           
        pool = manager.Pool(processes = 10)
        pool.map_async(test,range(len(station)))
        pool.close()
        pool.join()

The output is:

name 'phicc' is not defined
A
name 'phicc' is not defined
B
name 'phicc' is not defined
C

I don't know what is going on! I need the phicc variable to be recognized inside the test function. Thanks.

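phicc is assigned only inside the if __name__ == '__main__': block of the main process, so the pool's worker processes never define it. You need to initialize the global variable phicc in each process of the multiprocessing pool by using the initializer and initargs arguments of the multiprocessing.pool.Pool constructor: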

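from multiprocessing import Manager


def init_pool_processes(d):
    # Runs once in each pool worker process: bind the managed dict
    # to a module-level global so that test() can see it.
    global phicc
    phicc = d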

station = ["A", "B", "C"]

def test(k):
    try:
        print(phicc)
    except Exception as E:
        print(E)
        print(station[k])


if __name__ == '__main__':

    with Manager() as manager:
        phicc = manager.dict()
        for i in station:
            phicc[i] = manager.list()

        pool = manager.Pool(processes=10, initializer=init_pool_processes, initargs=(phicc,))
        pool.map_async(test, range(len(station)))
        pool.close()
        pool.join()

Prints:

{'A': <ListProxy object, typeid 'list' at 0x1e2df934eb0>, 'B': <ListProxy object, typeid 'list' at 0x1e2df940130>, 'C': <ListProxy object, typeid 'list' at 0x1e2df940310>}
{'A': <ListProxy object, typeid 'list' at 0x1e2df934eb0>, 'B': <ListProxy object, typeid 'list' at 0x1e2df940130>, 'C': <ListProxy object, typeid 'list' at 0x1e2df940310>}
{'A': <ListProxy object, typeid 'list' at 0x1e2df934eb0>, 'B': <ListProxy object, typeid 'list' at 0x1e2df940130>, 'C': <ListProxy object, typeid 'list' at 0x1e2df940310>}
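The same initializer pattern can also be used to write to the shared structure, not just print it. The following is a minimal sketch under a few assumptions: it uses a plain multiprocessing.Pool rather than manager.Pool, and the names STATIONS, init_worker, record and run are made up for illustration. Each worker appends one value to the managed list for its station, and the parent copies the result into plain Python objects before the manager shuts down.

```python
from multiprocessing import Manager, Pool

STATIONS = ["A", "B", "C"]


def init_worker(shared):
    # Runs once in every worker process; stores the dict proxy in a global.
    global phicc
    phicc = shared


def record(k):
    # Append to the managed list for this station. The list lives in the
    # manager process, so the parent can see the change as well.
    phicc[STATIONS[k]].append(k * 10)


def run():
    with Manager() as manager:
        shared = manager.dict()
        for name in STATIONS:
            shared[name] = manager.list()

        with Pool(processes=3, initializer=init_worker, initargs=(shared,)) as pool:
            pool.map(record, range(len(STATIONS)))

        # Copy into plain objects while the manager is still running.
        return {name: list(shared[name]) for name in STATIONS}


if __name__ == "__main__":
    print(run())  # {'A': [0], 'B': [10], 'C': [20]}
```

The blocking pool.map call is used here instead of map_async so that all appends have finished before the lists are read back.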

Notes

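I find it a bit odd that you are using a Pool proxy created with the manager.Pool call rather than creating a multiprocessing.pool.Pool instance, since the code you posted clearly has no need for that. If you use a "standard" pool and you are running under Linux or another platform that uses OS fork to create new processes, then strictly speaking you do not need the pool initializer from the example above, because each newly created process inherits the main process's global variables. (They are not "sharable", however: if a child process modifies an inherited global variable, it modifies only its own local copy.) So the following code works on Linux:
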
from multiprocessing import Process, Manager, Pool


station = ["A", "B", "C"]

def test(k):
    try:
        print(phicc)
    except Exception as E:
        print(E)
        print(station[k])


if __name__ == '__main__':

    with Manager() as manager:
        phicc = manager.dict()
        for i in station:
            phicc[i] = manager.list()

        # A multiprocessing.pool.Pool instance:
        pool = Pool(processes=10)
        pool.map_async(test, range(len(station)))
        pool.close()
        pool.join()

Prints:

{'A': <ListProxy object, typeid 'list' at 0x7f8c08a663d0>, 'B': <ListProxy object, typeid 'list' at 0x7f8c08a664c0>, 'C': <ListProxy object, typeid 'list' at 0x7f8c08a662e0>}
{'A': <ListProxy object, typeid 'list' at 0x7f8c08a663d0>, 'B': <ListProxy object, typeid 'list' at 0x7f8c08a664c0>, 'C': <ListProxy object, typeid 'list' at 0x7f8c08a662e0>}
{'A': <ListProxy object, typeid 'list' at 0x7f8c08a663d0>, 'B': <ListProxy object, typeid 'list' at 0x7f8c08a664c0>, 'C': <ListProxy object, typeid 'list' at 0x7f8c08a662e0>}
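The remark that inherited globals are not "sharable" can be checked with a small experiment. This is only an illustrative sketch; counter, bump and run are hypothetical names that do not appear in the original post. Each worker increments a plain module-level dict, and the parent then inspects its own copy.

```python
from multiprocessing import Pool

# A plain module-level global, not a managed object.
counter = {"hits": 0}


def bump(_):
    # Each worker process mutates its own copy of `counter`.
    counter["hits"] += 1
    return counter["hits"]


def run():
    with Pool(processes=2) as pool:
        pool.map(bump, range(4))
    # The parent's copy was never touched by the workers.
    return counter["hits"]


if __name__ == "__main__":
    print(run())  # 0
```

The parent still sees 0 because every worker changed only its own copy. A managed object such as manager.dict() behaves differently: the data lives in the manager process and every process talks to it through a proxy.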

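This is why, when you post a question tagged multiprocessing, you should also tag it with your platform, such as windows or linux, because the solutions differ. The posted code that uses a pool initializer works on both Windows and Linux, but if your platform is Linux, the Linux-only code above is more efficient.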
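If you are unsure which case applies to you, one option is to ask multiprocessing which start method it is using. The helper below is a hypothetical sketch (describe_start_method is not part of the library); it relies only on multiprocessing.get_start_method(), and treats "fork" as the one method under which workers inherit globals set in the main block. The default depends on both the platform and the Python version.

```python
import multiprocessing as mp


def describe_start_method():
    # Returns the active start method and whether globals assigned under
    # `if __name__ == '__main__':` are inherited by child processes.
    method = mp.get_start_method()
    inherits_main_globals = method == "fork"
    return method, inherits_main_globals


if __name__ == "__main__":
    method, inherits = describe_start_method()
    print(f"start method: {method}, inherits main-block globals: {inherits}")
```

With any method other than "fork", the pool initializer approach is the one to use.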