Control the number of jobs in multiprocessing async Python

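I'm trying to write a script that uses only 4 processes at a time and starts another process as soon as one returns a value. I think part of the problem is that results.get waits until it has a result and won't continue until a value is returned. I want the while loop to keep running while it waits for results.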

https://docs.python.org/2.7/library/multiprocessing.html#multiprocessing.pool.AsyncResult.get

import multiprocessing as mp
import time
from random import randint


def print_hello(VPN_Name):
    time.sleep(randint(0,5))
    return VPN_Name


VPN_list = ['3048-VPN01', '3049-VPN01', '3051-VPN01', '3053-VPN01', '3058-VPN01', '3059-VPN01', '3061-MULTI01', '3063-VPN01', '3065-VPN01', '3066-MULTI01', '3067-VPN01', '3069-VPN01', '3071-VPN01', '3072-VPN01']

VPN_len = len(VPN_list)
x = 0
pool = mp.Pool(processes=4)

job_tracker = []
complete_tracker = []


while True:
    for VPN_Name in VPN_list:
        if VPN_len == 0:
            break
        while True:
            print "Complete Job Tracker ", complete_tracker
            print "Job Tracker ", job_tracker
            for comp in complete_tracker:
                if comp in job_tracker:
                    x = x - 1
                    job_tracker.remove(comp)
                    print "Confirmed complete " + comp
                continue
            if x < 4:
                results = pool.apply_async(print_hello,args=(VPN_Name,))
                VPN_len = VPN_len - 1
                x = x + 1
                print "Started  " + VPN_Name
                job_tracker.append(VPN_Name)
                complete_tracker.append(results.get())
                break
            continue

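Your loop doesn't work because results.get blocks until the result is available, which effectively makes your code non-parallel: each job is submitted and then waited on before the next one starts. You also seem to be doing a lot of extra work to get functionality that multiprocessing.Pool gives you automatically.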
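To illustrate the blocking point, here is a minimal sketch that submits every job first and only then calls get. The helper name submit_then_collect and the short fixed sleep are assumptions made for the example, not part of the original script. print is called with a single string so the sketch runs on both Python 2.7 and Python 3.

```python
import multiprocessing as mp
import time


def print_hello(vpn_name):
    # Stand-in for real work; the short fixed sleep is an assumption.
    time.sleep(0.1)
    return vpn_name


def submit_then_collect(names, workers=4):
    # Hypothetical helper: submit all jobs, then wait for the results.
    pool = mp.Pool(processes=workers)
    try:
        # apply_async returns an AsyncResult immediately, so this
        # comprehension queues every job without waiting on any of them.
        pending = [pool.apply_async(print_hello, args=(name,))
                   for name in names]
        # Only now block on each result; the pool has been running
        # up to `workers` jobs at a time in the meantime.
        return [job.get() for job in pending]
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for name in submit_then_collect(['3048-VPN01', '3049-VPN01', '3051-VPN01']):
        print("Confirmed complete " + name)
```

Because get is called only after all jobs are queued, the pool itself limits concurrency to 4, and no manual counter is needed.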

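When you call mp.Pool(4), you create a pool of 4 worker processes, so when you pass many tasks to that pool, it executes 4 at a time until they are all done. There are even functions that submit a whole collection of inputs to the pool, so you don't have to iterate yourself.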

This lets you replace the entire while loop with:

pool = mp.Pool(processes=4)

results = pool.map(print_hello, VPN_list)

for result in results:
    print "Confirmed complete " + result

This will block in pool.map until all the tasks are complete and then return them all at once, in the order you submitted them. If you want each result as soon as it is ready while still receiving them in submission order, you can use pool.imap; if you don't care about order and just want results as soon as they are available, use pool.imap_unordered.
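As a rough sketch of the last option, the loop below handles each result as soon as a worker finishes it. The helper name run_unordered and the shortened random sleep are assumptions for illustration. print is called with a single string so the code runs on both Python 2.7 and Python 3.

```python
import multiprocessing as mp
import time
from random import randint


def print_hello(vpn_name):
    # Random short delay so jobs finish in an unpredictable order
    # (shortened from the question's 0-5 seconds; an assumption).
    time.sleep(randint(0, 2) / 10.0)
    return vpn_name


def run_unordered(names, workers=4):
    # Hypothetical helper: yields to the loop body as each job completes.
    pool = mp.Pool(processes=workers)
    completed = []
    try:
        # imap_unordered hands back results in completion order,
        # not submission order.
        for result in pool.imap_unordered(print_hello, names):
            print("Confirmed complete " + result)
            completed.append(result)
    finally:
        pool.close()
        pool.join()
    return completed


if __name__ == "__main__":
    run_unordered(['3048-VPN01', '3049-VPN01', '3051-VPN01', '3053-VPN01',
                   '3058-VPN01', '3059-VPN01'])
```

Swapping imap_unordered for imap in the loop keeps the same structure but delivers results in submission order.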