How to run requests.get asynchronously in Python 3 using asyncio?
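I'm trying to create a simple web monitoring script that sends GET requests to the urls in a list, periodically and asynchronously. Here is my request function: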
def request(url,timeout=10):
    try:
        response = requests.get(url,timeout=timeout)
        response_time = response.elapsed.total_seconds()
        if response.status_code in (404,500):
            response.raise_for_status()
        html_response = response.text
        soup = BeautifulSoup(html_response,'lxml')
        # process page here
        logger.info("OK {}. Response time: {} seconds".format(url,response_time))
    except requests.exceptions.ConnectionError:
        logger.error('Connection error. {} is down. Response time: {} seconds'.format(url,response_time))
    except requests.exceptions.Timeout:
        logger.error('Timeout. {} not responding. Response time: {} seconds'.format(url,response_time))
    except requests.exceptions.HTTPError:
        logger.error('HTTP Error. {} returned status code {}. Response time: {} seconds'.format(url,response.status_code, response_time))
    except requests.exceptions.TooManyRedirects:
        logger.error('Too many redirects for {}. Response time: {} seconds'.format(url,response_time))
    except:
        logger.error('Content requirement not found for {}. Response time: {} seconds'.format(url,response_time))
And here is where I call this function for all the urls:
def async_requests(delay,urls):
    for url in urls:
        async_task = make_async(request,delay,url,10)
        loop.call_soon(delay,async_task)
    try:
        loop.run_forever()
    finally:
        loop.close()
The delay parameter is the loop interval: it describes how often the function needs to be executed. To loop request, I created something like this:
def make_async(func,delay,*args,**kwargs):

    def wrapper(*args, **kwargs):
        func(*args, **kwargs)
        loop.call_soon(delay, wrapper)

    return wrapper
Every time I execute async_requests, I get this error for each url:
Exception in callback 1.0(<function mak...x7f1d48dd1730>)
handle: <Handle 1.0(<function mak...x7f1d48dd1730>)>
Traceback (most recent call last):
  File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run
    self._callback(*self._args)
TypeError: 'float' object is not callable
The request function is also not executed periodically for each url as expected. The print call after my async_requests is not executed either:
async_requests(args.delay,urls)
print("Starting...")
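I know I'm doing something wrong in the code, but I can't figure out how to fix it. I'm a beginner in python and not very experienced with asyncio.
To summarize what I want to achieve:
- Run request asynchronously and periodically for a particular url without blocking the main thread.
- Run async_requests asynchronously, so that I can start a simple http server, for example, in the same thread.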
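For reference, the TypeError in the traceback above follows from the argument order. loop.call_soon(callback, *args) takes the callback first and has no delay parameter, so passing delay first makes the loop try to call the float 1.0. The method that accepts a delay is loop.call_later(delay, callback, *args). Below is a minimal sketch of a self-rescheduling callback under that assumption. The names check, schedule_periodic and run_demo are hypothetical, check is only a stand-in for request, and the short interval is chosen for the demo.

```python
import asyncio

calls = []  # records each invocation so the demo can be inspected


def check(url):
    # Hypothetical stand-in for the blocking request() function.
    calls.append(url)


def schedule_periodic(loop, delay, func, *args):
    """Run func(*args) as soon as possible, then every `delay` seconds."""
    def wrapper():
        func(*args)
        # call_later takes the delay first, then the callback.
        loop.call_later(delay, wrapper)

    # call_soon takes only the callback (plus optional arguments).
    loop.call_soon(wrapper)


def run_demo(duration=0.35, delay=0.1):
    loop = asyncio.new_event_loop()
    try:
        schedule_periodic(loop, delay, check, "http://example.com")
        loop.call_later(duration, loop.stop)  # end the demo after a while
        loop.run_forever()
    finally:
        loop.close()
    return len(calls)
```

Note that check still runs in the event loop's own thread here, so a blocking call such as requests.get would stall every other callback while it waits. Also, loop.run_forever() blocks until the loop is stopped, which is why a print placed after async_requests never runs.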
except:
This will also catch service exceptions such as KeyboardInterrupt or SystemExit. Never do this. Instead, write:
except Exception:
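A small sketch of the difference, using two hypothetical helper functions. KeyboardInterrupt and SystemExit derive from BaseException rather than Exception, so except Exception lets them propagate while a bare except swallows them.

```python
def swallow_everything(exc):
    """Bare except: traps every exception, including KeyboardInterrupt."""
    try:
        raise exc
    except:  # noqa: E722 - shown only to illustrate the problem
        return "swallowed"


def swallow_errors_only(exc):
    """except Exception: ordinary errors are caught, Ctrl+C is not."""
    try:
        raise exc
    except Exception:
        return "swallowed"
```

With these helpers, swallow_everything(KeyboardInterrupt()) returns normally, whereas swallow_errors_only(KeyboardInterrupt()) re-raises the interrupt to the caller.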
How to run requests.get asynchronously in Python 3 using asyncio?
requests.get is blocking by nature. You should either find an asynchronous alternative for requests, such as the aiohttp module:
import aiohttp

async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            return await resp.text()
or run requests.get in a separate thread and await that thread asynchronously using loop.run_in_executor:
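import asyncio
from concurrent.futures import ThreadPoolExecutor

import requests

executor = ThreadPoolExecutor(2)

async def get(url):
    loop = asyncio.get_event_loop()  # the loop running this coroutine
    response = await loop.run_in_executor(executor, requests.get, url)
    return response.text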
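To connect the executor approach with the periodic checks the question asks for, here is a minimal sketch, not a definitive implementation. It assumes a hypothetical blocking_check function standing in for request (it only sleeps to mimic blocking I/O), and the names monitor, monitor_all and main, the short delay, and the fixed number of rounds are all chosen for the demo. A real monitor would likely loop with while True instead.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_check(url):
    # Hypothetical stand-in for request(url); sleeps to mimic blocking I/O.
    time.sleep(0.05)
    return url


async def monitor(url, delay, rounds, executor, results):
    loop = asyncio.get_event_loop()
    for _ in range(rounds):
        # The blocking call runs in a worker thread; the loop stays free.
        result = await loop.run_in_executor(executor, blocking_check, url)
        results.append(result)
        await asyncio.sleep(delay)  # non-blocking pause between checks


async def monitor_all(urls, delay=0.1, rounds=2):
    results = []
    with ThreadPoolExecutor(max_workers=len(urls)) as executor:
        # One monitoring coroutine per url, all running concurrently.
        await asyncio.gather(
            *(monitor(u, delay, rounds, executor, results) for u in urls)
        )
    return results


def main():
    loop = asyncio.new_event_loop()
    try:
        urls = ["http://a.example", "http://b.example"]
        return loop.run_until_complete(monitor_all(urls))
    finally:
        loop.close()
```

Because each check is awaited rather than called directly, other coroutines on the same loop (for example a simple http server) can keep running while the worker threads wait on the network.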