Timeout within session while sending requests
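I'm trying to understand how to use timeout within a session while sending requests. The approach I've tried below fetches the content of the web page, but I'm not sure this is the right way, as I could not find the usage of timeout in this documentation.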
import requests

link = "https://whosebug.com/questions/tagged/web-scraping"

with requests.Session() as s:
    r = s.get(link, timeout=5)
    print(r.text)
How can I use timeout within a session?
According to the Documentation - Quick Start:
You can tell Requests to stop waiting for a response after a given
number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests.
requests.get('https://github.com/', timeout=0.001)
Or, from the Documentation - Advanced Usage, you can set two values (connect and read timeouts):
The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values
separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
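Making a session-wide timeout
I searched the whole documentation and found no way to set the timeout parameter session-wide.
But there is an open GitHub issue (Consider making Timeout option required or have a default) which provides a workaround in the form of an HTTPAdapter.
You can use it like this: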
import requests
from requests.adapters import HTTPAdapter

class TimeoutHTTPAdapter(HTTPAdapter):
    def __init__(self, *args, **kwargs):
        # Pull out our own "timeout" keyword before HTTPAdapter sees it.
        if "timeout" in kwargs:
            self.timeout = kwargs["timeout"]
            del kwargs["timeout"]
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # Fall back to the adapter's default only when the caller gave no timeout.
        timeout = kwargs.get("timeout")
        if timeout is None and hasattr(self, 'timeout'):
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)
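And mount it on a requests.Session(). Pass the timeout by keyword: a positional 5 would be taken as HTTPAdapter's first parameter (pool_connections) rather than as the timeout.
s = requests.Session()
s.mount('http://', TimeoutHTTPAdapter(timeout=5))  # 5 seconds
s.mount('https://', TimeoutHTTPAdapter(timeout=5))
...
r = s.get(link)
print(r.text)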
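One way to confirm that a mounted adapter really supplies the default is to intercept HTTPAdapter.send, so that nothing goes over the network. This is only a sketch: effective_timeout and fake_send are hypothetical helper names, the URL is a placeholder, and the adapter is redefined here with a keyword-only timeout argument so that the block is self-contained.

```python
from unittest import mock

import requests
from requests.adapters import HTTPAdapter


class TimeoutHTTPAdapter(HTTPAdapter):
    """Adapter that fills in a default timeout when the caller gives none."""

    def __init__(self, *args, timeout=None, **kwargs):
        self.timeout = timeout
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)


def effective_timeout(session, url, **request_kwargs):
    """Return the timeout that reaches HTTPAdapter.send, without network I/O."""
    seen = {}

    def fake_send(self, request, **kwargs):
        # Record the timeout and hand back an empty 200 response.
        seen["timeout"] = kwargs.get("timeout")
        response = requests.Response()
        response.status_code = 200
        return response

    # Replace only the base class method; TimeoutHTTPAdapter.send still runs.
    with mock.patch.object(HTTPAdapter, "send", fake_send):
        session.get(url, **request_kwargs)
    return seen["timeout"]


if __name__ == "__main__":
    s = requests.Session()
    s.mount("http://", TimeoutHTTPAdapter(timeout=5))
    s.mount("https://", TimeoutHTTPAdapter(timeout=5))
    print(effective_timeout(s, "https://example.invalid/"))
    print(effective_timeout(s, "https://example.invalid/", timeout=1))
```

The first call should report the adapter default and the second the per-request value, which shows that an explicit timeout still wins over the default.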
Or, similarly, you can use the EnhancedSession proposed by @GordonAitchJay:
with EnhancedSession(5) as s:  # 5 seconds
    r = s.get(link)
    print(r.text)
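I'm not sure this is the right way as I could not find the usage of timeout in this documentation.
Scroll to the bottom. It's definitely there. You can search the page by pressing Ctrl+F and typing timeout.
You used timeout correctly in your code example.
You can actually specify the timeout in a few different ways, as described in the documentation: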
If you specify a single value for the timeout, like this:
r = requests.get('https://github.com', timeout=5)
The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.
r = requests.get('https://github.com', timeout=None)
Try using https://httpstat.us/200?sleep=5000 to test your code.
For example, this raises an exception because 0.2 seconds is not long enough to establish a connection with the server:
import requests

link = "https://httpstat.us/200?sleep=5000"

with requests.Session() as s:
    try:
        r = s.get(link, timeout=(0.2, 10))
        print(r.text)
    except requests.exceptions.Timeout as e:
        print(e)
Output:
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=0.2)
This raises an exception because the server waits 5 seconds before sending the response, which is longer than the 2-second read timeout that was set:
import requests

link = "https://httpstat.us/200?sleep=5000"

with requests.Session() as s:
    try:
        r = s.get(link, timeout=(3.05, 2))
        print(r.text)
    except requests.exceptions.Timeout as e:
        print(e)
Output:
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=2)
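Both examples above catch the general requests.exceptions.Timeout. Requests also defines ConnectTimeout and ReadTimeout as subclasses of it, so if you want to know which phase gave up you can check for them. A minimal sketch follows; describe_timeout and fetch are hypothetical helper names. Keep in mind that the reported phase is whatever the underlying library detected: in the first output above, the 0.2-second value shows up as a read timeout.

```python
import requests


def describe_timeout(exc):
    """Name the phase in which a requests timeout exception was raised."""
    if isinstance(exc, requests.exceptions.ConnectTimeout):
        return "connect"
    if isinstance(exc, requests.exceptions.ReadTimeout):
        return "read"
    return "unknown"


def fetch(url, timeout=(3.05, 27)):
    """GET a URL; print the timed-out phase and return None on timeout."""
    try:
        return requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout as exc:
        print(f"{describe_timeout(exc)} timeout: {exc}")
        return None
```

Since both subclasses inherit from Timeout, a single except clause for Timeout is still enough when you do not care about the phase.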
You specifically mentioned using a timeout within a session, so maybe you want a session object that has a default timeout. Something like this:
import requests

link = "https://httpstat.us/200?sleep=5000"

class EnhancedSession(requests.Session):
    def __init__(self, timeout=(3.05, 4)):
        super().__init__()
        # Default applied to every request that does not pass its own timeout.
        self.timeout = timeout

    def request(self, method, url, **kwargs):
        print("EnhancedSession request")
        if "timeout" not in kwargs:
            kwargs["timeout"] = self.timeout
        return super().request(method, url, **kwargs)

session = EnhancedSession()

# No timeout given: the session default (3.05, 4) applies.
try:
    response = session.get(link)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)

# An explicit timeout overrides the session default.
try:
    response = session.get(link, timeout=1)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)

# 10 seconds is longer than the server's 5-second delay, so this succeeds.
try:
    response = session.get(link, timeout=10)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)
Output:
EnhancedSession request
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=4)
EnhancedSession request
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=1)
EnhancedSession request
<Response [200]>
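If subclassing feels heavy, an alternative you will sometimes see is to wrap the session's request method with functools.partial so that timeout gets a default value. This is a sketch, not something the Requests documentation prescribes; session_with_default_timeout is a hypothetical helper name.

```python
import functools

import requests


def session_with_default_timeout(timeout=(3.05, 27)):
    """Return a Session whose requests use `timeout` unless one is passed."""
    session = requests.Session()
    # Session.get, .post, etc. all call self.request, so wrapping it once
    # covers every verb. A timeout passed at call time overrides the default.
    session.request = functools.partial(session.request, timeout=timeout)
    return session
```

Usage is the same as with a plain session: s = session_with_default_timeout(5), then s.get(link) uses 5 seconds and s.get(link, timeout=1) uses 1 second. Passing timeout=None explicitly switches the default off for that one call.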