Why doesn't asyncio always use executors?

Question

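I have to send a lot of HTTP requests; once all of them have returned, the program can continue. Sounds like a perfect match for asyncio. A bit naively, I wrapped my calls to requests in an async function and handed them to asyncio. This doesn't work.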

After searching online, I found two solutions:

Use a library like aiohttp, which works with asyncio
Wrap the blocking code in a call to run_in_executor

To understand this better, I wrote a small benchmark. The server side is a Flask program that waits 0.1 seconds before answering a request.

from flask import Flask
import time

app = Flask(__name__)


@app.route('/')
def hello_world():
    time.sleep(0.1)  # heavy calculations here :)
    return 'Hello World!'


if __name__ == '__main__':
    app.run()

The client is my benchmark:

import requests
from time import perf_counter, sleep

# this is the baseline, sequential calls to requests.get
start = perf_counter()
for i in range(10):
    r = requests.get("http://127.0.0.1:5000/")
stop = perf_counter()
print(f"synchronous took {stop-start} seconds") # 1.062 secs

# now the naive asyncio version
import asyncio
loop = asyncio.get_event_loop()

async def get_response():
    r = requests.get("http://127.0.0.1:5000/")

start = perf_counter()
loop.run_until_complete(asyncio.gather(*[get_response() for i in range(10)]))
stop = perf_counter()
print(f"asynchronous took {stop-start} seconds") # 1.049 secs

# the fast asyncio version
start = perf_counter()
loop.run_until_complete(asyncio.gather(
    *[loop.run_in_executor(None, requests.get, 'http://127.0.0.1:5000/') for i in range(10)]))
stop = perf_counter()
print(f"asynchronous (executor) took {stop-start} seconds") # 0.122 secs

# finally, aiohttp
import aiohttp

async def get_response(session):
    async with session.get("http://127.0.0.1:5000/") as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        await get_response(session)

start = perf_counter()
loop.run_until_complete(asyncio.gather(*[main() for i in range(10)]))
stop = perf_counter()
print(f"aiohttp took {stop-start} seconds") # 0.121 secs

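So an intuitive implementation with asyncio doesn't deal with blocking I/O code. But if you use asyncio correctly, it is just as fast as the special aiohttp framework. The docs for coroutines and tasks don't really mention this. Only when you read up on loop.run_in_executor() do the docs say: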

# File operations (such as logging) can block the
# event loop: run them in a thread pool.
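To make the quoted comment concrete, here is a minimal sketch (not taken from the docs). It assumes `time.sleep` as a stand-in for any blocking call such as `requests.get`; the names `ticker`, `count_ticks`, and the two `blocking_*` helpers are made up for illustration. A background task tries to tick every 10 ms while the blocking call runs.

```python
import asyncio
import time


async def ticker(ticks):
    # Records a timestamp roughly every 10 ms, but only while the
    # event loop is free to resume this coroutine.
    while True:
        await asyncio.sleep(0.01)
        ticks.append(time.perf_counter())


async def blocking_in_coroutine():
    # time.sleep never yields to the event loop, so no other task
    # can run until it returns (stand-in for requests.get).
    time.sleep(0.2)


async def blocking_in_executor():
    # The same blocking call, handed to the default thread pool.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, time.sleep, 0.2)


async def count_ticks(work):
    ticks = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # give the ticker a chance to start
    await work()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return len(ticks)


if __name__ == "__main__":
    print("ticks, blocking call inside a coroutine:",
          asyncio.run(count_ticks(blocking_in_coroutine)))
    print("ticks, blocking call inside an executor:",
          asyncio.run(count_ticks(blocking_in_executor)))
```

The first count should be zero, because the ticker never gets a turn while `time.sleep` holds the loop; the second should be well above zero, since the loop stays free while a worker thread sleeps.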

I was surprised by this behaviour. The purpose of asyncio is to speed up blocking io calls. Why is an additional wrapper, run_in_executor, necessary to do this?

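The whole selling point of aiohttp seems to be its support for asyncio. But as far as I can see, the requests module works perfectly - as long as you wrap it in an executor. Is there a reason to avoid wrapping something in an executor?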

Answer 1

But as far as I can see, the requests module works perfectly - as long as you wrap it in an executor. Is there a reason to avoid wrapping something in an executor ?

Running code in an executor means running it in OS threads.

aiohttp and similar libraries let you run non-blocking code without OS threads, using coroutines only.
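You can see the difference by asking each job which thread it runs on. This is only a sketch: `blocking_job`, `coroutine_job`, and `collect_thread_ids` are hypothetical names, and `asyncio.sleep(0)` stands in for a real non-blocking operation such as an aiohttp request.

```python
import asyncio
import threading


def blocking_job():
    # Runs wherever the executor schedules it: a pool worker thread.
    return threading.get_ident()


async def coroutine_job():
    # Yields to the loop once, then reports the thread it resumes on.
    await asyncio.sleep(0)
    return threading.get_ident()


async def collect_thread_ids(n=5):
    loop = asyncio.get_running_loop()
    loop_thread = threading.get_ident()
    executor_ids = await asyncio.gather(
        *[loop.run_in_executor(None, blocking_job) for _ in range(n)])
    coroutine_ids = await asyncio.gather(
        *[coroutine_job() for _ in range(n)])
    return loop_thread, set(executor_ids), set(coroutine_ids)


if __name__ == "__main__":
    loop_thread, executor_ids, coroutine_ids = asyncio.run(collect_thread_ids())
    print("event loop thread:", loop_thread)
    print("executor job threads:", executor_ids)
    print("coroutine job threads:", coroutine_ids)
```

Every coroutine reports the event loop's own thread, while the executor jobs report one or more separate worker threads.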

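If you don't have much work, the difference between OS threads and coroutines is not significant, especially compared to the bottleneck: the I/O operations. But once you have a lot of work, you'll notice that OS threads perform relatively worse because of expensive context switching.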
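If you want to explore this without running an HTTP server, here is a rough sketch with simulated I/O. It assumes `time.sleep` as a stand-in for a blocking request and `asyncio.sleep` for a non-blocking one; the task count, delay, and pool size of 32 workers are arbitrary values I picked, and the function names are made up. Timings will vary by machine, and part of any gap comes from the bounded pool size rather than from context switching alone.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

N = 200        # number of simulated requests (assumed value)
DELAY = 0.01   # simulated I/O latency in seconds (assumed value)


async def run_with_executor(n=N, delay=DELAY, workers=32):
    # Each simulated request occupies one OS thread while it "waits".
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        start = time.perf_counter()
        results = await asyncio.gather(
            *[loop.run_in_executor(pool, time.sleep, delay) for _ in range(n)])
        return time.perf_counter() - start, len(results)


async def run_with_coroutines(n=N, delay=DELAY):
    # All simulated requests wait concurrently on the event loop thread.
    start = time.perf_counter()
    results = await asyncio.gather(*[asyncio.sleep(delay) for _ in range(n)])
    return time.perf_counter() - start, len(results)


if __name__ == "__main__":
    elapsed, count = asyncio.run(run_with_executor())
    print(f"executor:   {count} tasks in {elapsed:.3f} seconds")
    elapsed, count = asyncio.run(run_with_coroutines())
    print(f"coroutines: {count} tasks in {elapsed:.3f} seconds")
```

Raising `N` or lowering `workers` should widen the gap, which is one way to see how thread-based concurrency scales compared to coroutines.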

For example, when I change your code to time.sleep(0.001) and range(100), my machine shows:

asynchronous (executor) took 0.21461606299999997 seconds
aiohttp took 0.12484742700000007 seconds

And this difference will only grow with the number of requests.

The purpose of asyncio is to speed up blocking io calls.

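No, the purpose of asyncio is to provide a convenient way to control execution flow. asyncio lets you choose how the flow works: based on coroutines and OS threads (when you use an executor), or on pure coroutines (as aiohttp does).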
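As a small illustration of that choice, both styles can be awaited side by side in one `gather` call. This is a sketch under assumptions: `blocking_call` stands in for something like `requests.get`, `native_call` for something like an aiohttp request, and both names are hypothetical.

```python
import asyncio
import time


def blocking_call(name):
    # Ordinary blocking function; needs a thread to avoid stalling the loop.
    time.sleep(0.05)
    return f"{name} (thread)"


async def native_call(name):
    # Native coroutine; waits without occupying a thread.
    await asyncio.sleep(0.05)
    return f"{name} (coroutine)"


async def main():
    loop = asyncio.get_running_loop()
    # gather accepts both the executor future and the coroutine,
    # and returns results in the order the awaitables were passed.
    return await asyncio.gather(
        loop.run_in_executor(None, blocking_call, "legacy"),
        native_call("native"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The calling code looks the same in both cases; only the mechanism underneath (worker thread or coroutine) differs.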

It's aiohttp's purpose to speed things up, and it copes with that task as shown above :)
