Python global variable error with multiple functions

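I have some sample code here that uses a global variable, and it gives me an error. The global variable x is declared and assigned in test3 before test2 is called, but test2 does not seem to get the definition of the global variable x.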

from multiprocessing import Pool
import numpy as np

global x    

def test1(w, y):
    return w+y    

def test2(v):
    global x        # x is assigned value in test3 before test2 is called
    return test1(x, v)    

def test3():
    global x
    x = 2
    y = np.random.random(10)
    with Pool(processes=6) as p:
        z = p.map(test2, y)
    print(z)

if __name__ == '__main__':
    test3()

The error is:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "...\my_global_variable_testcode.py", line 23, in test2
    return test1(x, v)
NameError: name 'x' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "...\my_global_variable_testcode.py", line 35, in <module>
    test3()
  File "...\my_global_variable_testcode.py", line 31, in test3
    z = p.map(test2, y)
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 608, in get
    raise self._value
NameError: name 'x' is not defined

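I have read many questions and answers on SO, but I still cannot figure out how to fix this code. I would be grateful if someone could point out what is wrong with it.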

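Can anyone show me how to rewrite the code above without changing its basic structure (i.e. keeping test1, test2 and test3 as three separate functions, as in my original code, where these functions are long and complex), so that I can still achieve my goal of multiprocessing?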

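P.S. This sample code is only a simplified version of my actual code. I am giving the simplified version here to figure out how to make the global variable work (not to find an elaborate way of computing 2 + np.random.random(10)).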

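* EDIT * - bounty description

This bounty is for someone who can help me rewrite this code while keeping the basic structure of its functions:

(i) test3 makes a multiprocessing call on test2, and test2 in turn calls test1

(ii) it uses global variables, the Manager class of the multiprocessing module, or anything else that avoids test3 having to pass the common variables to test2

(iii) test3 also assigns values to, or changes, the global variables / common data before calling the multiprocessing code

(iv) the code should work on Windows (since that is what I am using). I am not currently looking for a solution that works only on Linux / OS X.

To help with the bounty, here are two different test cases.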

* Case 1 - non-multiprocessing version *

import numpy as np

x = 3

def test1(w, y):
    return w+y

def test2(v):
    global x
    print('x in test2 = ', x)
    return test1(x, v)

def test3():
    global x
    x = 2
    print('x in test3 = ', x)
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    z = test2(y)
    print(z)

if __name__ == '__main__':
    test3()

The output (correct) is:

x in test3 =  2
x in test2 =  2
[ 3  4  5  6  7  8  9 10 11 12]

* Case 2 - multiprocessing version *

from multiprocessing import Pool
import numpy as np

x = 3

def test1(w, y):
    return w+y

def test2(v):
    global x
    print('x in test2 = ', x)
    return test1(x, v)

def test3():
    global x
    x = 2
    print('x in test3 = ', x)
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    with Pool(processes=6) as p:
        z = p.map(test2, y)
    print(z)

if __name__ == '__main__':
    test3()

The output (incorrect) is:

x in test3 =  2
x in test2 =  3
x in test2 =  3
x in test2 =  3
x in test2 =  3
x in test2 =  3
x in test2 =  3
x in test2 =  3
x in test2 =  3
x in test2 =  3
x in test2 =  3
[4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

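You have to define the variable x outside the functions: instead of a bare global x at module level, write for example x = 0 or whatever value you like, and keep the global declarations inside the functions as you do now. Hope that helps.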

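Your problem is that you are sharing the variable within one process, not across the processes of the multiprocessing pool. When you use global x, it works inside a single process but not across multiple processes. In that case you need to use Value from multiprocessing. Below is the updated code, which works with multiprocessing: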

from multiprocessing import Pool, Value
import numpy as np

xVal = Value('i', 0)

def test1(w, y):
    return w+y

def test2(v):
    x = xVal.value
    print('x in test2 = ', x)
    return test1(x, v)

def test3():
    xVal.value = 2

    print('x in test3 = ', xVal.value)
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    with Pool(processes=6) as p:
        z = p.map(test2, y)
    print(z)

if __name__ == '__main__':
    test3()

The program output is as follows:

x in test3 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
x in test2 =  2
[3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
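
Whether the module-level Value above is seen with the value 2 in the workers depends on how the worker processes are started. With the fork start method the workers begin as a copy of the parent, so they share the xVal that test3 already changed. On Windows only spawn is available, and each worker re-imports the module and builds a fresh Value('i', 0). A small check, offered as a sketch (the helper name inherits_parent_state is made up for illustration), reports which case applies on a given machine:

```python
import multiprocessing as mp


def inherits_parent_state(method):
    # Hypothetical helper: only 'fork' starts workers as a copy of the parent;
    # 'spawn' and 'forkserver' workers re-import the main module instead.
    return method == 'fork'


if __name__ == '__main__':
    print('available start methods:', mp.get_all_start_methods())
    method = mp.get_start_method()
    print('default start method:', method)
    if inherits_parent_state(method):
        print('workers share the xVal object that test3 already set to 2')
    else:
        print('workers re-create xVal = Value("i", 0) when the module is imported')
```

If the second message is printed, the shared object has to be handed to the workers explicitly when the pool is created.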

Edit-2

The program below should also work on Windows:

from multiprocessing import Pool, Value
import numpy as np

xVal = None  # set in the parent under __main__, and in each worker by sharedata()

def sharedata(sharedData):
    # Pool initializer: runs once in every worker process
    global xVal
    xVal = sharedData

def test1(w, y):
    return w+y

def test2(v):
    global xVal
    x = xVal.value
    print('x in test2 = ', x)
    return test1(x, v)


def test3():
    xVal.value = 2
    print('x in test3 = ', xVal.value)
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    # hand the shared Value to each worker when it starts
    with Pool(processes=6, initializer=sharedata, initargs=(xVal,)) as p:
        z = p.map(test2, y)
    print('x in test3 = ', xVal.value)
    print(z)

if __name__ == '__main__':
    xVal = Value('i', 0)
    test3()