How to control the number of cores and processors in Python when we need to share state between processes

The Python documentation shows an example of how to share state between processes. To save time, I am posting the code from the documentation below.

from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print(num.value)
    print(arr[:])

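I am writing code on a Linux supercomputing system, where I can allocate a fixed number of cores and a number of cores per node. How do I write code that assigns the workers in this setup? Will Python automatically make full use of the cores? What is the right way to set the parameters so that the computing resources are fully used?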

Possible answer:

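from multiprocessing import Array, Pool, Value, cpu_count
import random

def init(n, a):
    # runs once in every worker: Value and Array cannot be sent
    # through map(), so hand them over here and keep them as globals
    global num, arr
    num, arr = n, a

def f(_):
    num.value = 3.1415927
    # randint needs both bounds and includes them: indices 0..4
    arr[random.randint(0, 4)] = 0

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    # you can manually set the number of workers
    workers = 5
    # or, if you want to use all cores,
    # use the number of existing cores to set workers
    # workers = cpu_count()
    p = Pool(workers, initializer=init, initargs=(num, arr))
    # one task per worker; f ignores its argument and returns None
    p.map(f, range(workers))
    p.close()
    p.join()

    print(num.value)
    print(arr[:])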

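The intent of the code above is that the final print statements show 3.1415927 for num and a list in which some of the first five elements have been set to 0. Which elements are zeroed depends on the random indices drawn, so the output varies from run to run; if every index from 0 to 4 happens to be hit, the list is [0, 0, 0, 0, 0, 5, 6, 7, 8, 9].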
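
On a cluster, cpu_count() reports every core on the node, which can be more than the job is allowed to use. The following is a minimal sketch of choosing the worker count from the cores the process may actually run on. It assumes Linux (os.sched_getaffinity does not exist on every platform) and assumes the scheduler restricts the job through CPU affinity, which depends on how the cluster is configured. The names available_cores and square are made up for illustration.

```python
import os
from multiprocessing import Pool, cpu_count

def available_cores():
    """Hypothetical helper: number of cores this process may run on."""
    try:
        # Linux only: honours CPU affinity set by a scheduler or taskset
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # other platforms: fall back to the total core count
        return cpu_count()

def square(x):
    return x * x

if __name__ == '__main__':
    workers = available_cores()
    with Pool(workers) as pool:
        print(workers, pool.map(square, range(5)))
```

Note that multiprocessing only starts processes on the node the script runs on, so cores allocated on other nodes are not used by a Pool or by Process objects.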

If you want to use Process objects instead, you have to create, start, and join each process yourself:

from multiprocessing import Process, Value, Array, cpu_count
import random

def f(n, a):
    n.value = 3.1415927
    # randint needs both bounds and includes them: indices 0..4
    a[random.randint(0, 4)] = 0

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    # if you want to use all cores,
    # use the number of existing cores to set workers
    workers = cpu_count()
    # keep the Process objects themselves: start() returns None
    prcs = [Process(target=f, args=(num, arr)) for i in range(workers)]
    for p in prcs:
        p.start()
    for p in prcs:
        p.join()

    print(num.value)
    print(arr[:])
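
The examples above only assign to shared values, and each single assignment is protected by the lock that Value and Array create by default. A read-modify-write such as incrementing a counter is not atomic, so several workers doing it at once can lose updates. The following is a minimal sketch of holding the Value's own lock around such an update. The names add_many and run, and the counts used, are made up for illustration.

```python
from multiprocessing import Process, Value

def add_many(counter, times):
    for _ in range(times):
        # += on a shared Value is a read followed by a write,
        # so hold the Value's own lock around the pair
        with counter.get_lock():
            counter.value += 1

def run(workers=4, times=1000):
    counter = Value('i', 0)
    prcs = [Process(target=add_many, args=(counter, times))
            for _ in range(workers)]
    for p in prcs:
        p.start()
    for p in prcs:
        p.join()
    return counter.value

if __name__ == '__main__':
    print(run())  # 4 workers * 1000 increments each
```

Taking the lock on every increment serialises the workers, so in real code it is usually better to accumulate locally in each process and update the shared value once at the end.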