在单独的线程中启动异步事件循环并使用队列项

Start asyncio event loop in separate thread and consume queue items

我正在编写一个 Python 程序,该程序 运行 任务同时从队列中取出,以学习 asyncio

项目将通过与主线程交互(在 REPL 内)被放入队列。 每当将任务放入队列时,都应立即使用并执行。 我的方法是启动一个单独的线程并将队列传递给该线程内的事件循环。

这些任务是 运行 宁,但只是按顺序进行,我不清楚如何 运行 同时执行这些任务。我的尝试如下:

import asyncio
import time
import queue
import threading

def do_it(task_queue):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        while True:
            item = task_queue.get()
            if item == _sentinel:
                break
            await asyncio.create_task(process(*item))

    print('event loop start')
    asyncio.run(main())
    print('event loop end')


if __name__ == '__main__':
    tasks = queue.Queue()
    th = threading.Thread(target=do_it, args=(tasks,))
    th.start()

    tasks.put(('abc', 5))
    tasks.put(('def', 3))

如果有任何建议可以指导我运行同时执行这些任务,我们将不胜感激!
谢谢

更新
谢谢 Frank Yellin 和 cynthi8!我已经根据你的建议改造了 main():

程序现在按预期工作

更新 2
user4815162342 提供了进一步的改进,我在下面注释了他的建议。

'''
Starts auxiliary thread which establishes a queue and consumes tasks within a
queue.
    
Allow enqueueing of tasks from within __main__ and termination of aux thread
'''
import asyncio
import time
import threading
import functools

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        print(f'{clock()} {name}_{total_time}:', 'Started.')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(f'{clock()} {name}_{total_time}:', 'Processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(f'{clock()} {name}_{total_time}:', 'Done.')

    async def main():
        # get_running_loop() get the running event loop in the current OS thread
        # out to __main__ thread
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                # task_done is used to tell join when the work in the queue is 
                # actually finished. A queue length of zero does not mean work
                # is complete.
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            # Add a callback to be run when the Task is done.
            # Indicate that a formerly enqueued task is complete. Used by queue 
            # consumer threads. For each get() used to fetch a task, a 
            # subsequent call to task_done() tells the queue that the processing
            # on the task is complete.
            task.add_done_callback(lambda _: task_queue.task_done())            

        # keep loop going until all the work has completed
        # When the count of unfinished tasks drops to zero, join() unblocks.
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    # started Event is used for communication with thread th
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    # started.wait() blocks until started.set(), ensuring that the tasks and
    # loop variables are available from the event loop thread
    started.wait()
    tasks, loop = started.queue, started.loop

    # call_soon schedules the callback callback to be called with args arguments
    # at the next iteration of the event loop.
    # call_soon_threadsafe is required to schedule callbacks from another thread 
    
    # put_nowait enqueues items in non-blocking fashion, == put(block=False)
    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')

你的代码有两个问题。

首先,asyncio.create_task 之前不应该有 await。这可能是导致您的代码同步 运行 的原因。

然后,一旦您编写了异步代码 运行,您需要在 main 中的 while 循环之后添加一些内容,这样代码就不会立即 return,而是等待所有作业完成。另一个Whosebug 推荐:

while len(asyncio.Task.all_tasks()) > 1:  # Any task besides main() itself?
    await asyncio.sleep(0.2)

或者有一些版本的 Queue 可以跟踪 运行ning 任务。

作为附加问题:

如果 queue.Queue 为空,默认情况下 get() 会阻塞并且不会 return 标记字符串。 https://docs.python.org/3/library/queue.html

正如其他人所指出的,您的代码的问题在于它使用了一个阻塞队列,该队列在等待下一项时会暂停事件循环。然而,所提出的解决方案的问题在于它引入了延迟,因为它必须偶尔休眠以允许其他任务 运行。除了引入延迟之外,它还会阻止程序进入休眠状态,即使队列中没有项目也是如此。

另一种方法是切换到 asyncio queue,它专为与 asyncio 一起使用而设计。此队列必须在 运行ning 循环内创建,因此您不能将其传递给 do_it,您必须检索它。此外,由于它是一个 asyncio 原语,因此必须通过 call_soon_threadsafe 调用其 put 方法以确保事件循环注意到它。

最后一个问题是您的 main() 函数使用另一个繁忙的循环来等待所有任务完成。对于此用例,可以通过使用 Queue.join, which is 来避免这种情况。

这是您的代码,根据上述所有建议进行了调整,process 函数与原始代码保持不变:

import asyncio
import time
import threading

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            task.add_done_callback(lambda _: task_queue.task_done())
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    started.wait()
    tasks, loop = started.queue, started.loop

    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')

注意:与您的代码无关的问题是它 等待 create_task() 的结果,这使 create_task() 无效,因为它不是不允许在后台 运行。 (这相当于立即加入你刚刚开始的线程 - 你可以这样做,但它没有多大意义。)这个问题在上面的代码和你对问题的编辑中都得到了修复。