How to solve ConnectionError (RemoteDisconnected) in Python?
I am trying to scrape https://gmatclub.com/forum/decision-tracker.html and I can get most of what I want, but sometimes I run into ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
How can I solve it?
My code is:
import requests

link = 'https://gmatclub.com/api/schools/v1/forum/app-tracker-latest-updates'
params = {
    'limit': 500,
    'offset': 0,
    'year': 'all'
}

with requests.Session() as con:
    con.headers["User-Agent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.86 YaBrowser/21.3.0.740 Yowser/2.5 Safari/537.36"
    con.get("https://gmatclub.com/forum/decision-tracker.html")
    while True:
        endpoint = con.get(link,params=params).json()
        if not endpoint["statistics"]:break
        for item in endpoint["statistics"]:
            print(item['school_title'])
        params['offset']+=499
One strategy is to repeat the request until you get a correct response from the server, for example:
import requests
from time import sleep

link = "https://gmatclub.com/api/schools/v1/forum/app-tracker-latest-updates"
params = {"limit": 500, "offset": 0, "year": "all"}

with requests.Session() as con:
    con.headers[
        "User-Agent"
    ] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.86 YaBrowser/21.3.0.740 Yowser/2.5 Safari/537.36"
    con.get("https://gmatclub.com/forum/decision-tracker.html")
    while True:
        # repeat until we got correct response from server:
        while True:
            try:
                endpoint = con.get(link, params=params).json()
                break
            except requests.exceptions.ConnectionError:
                sleep(3)  # wait a little bit and try again
                continue
        if not endpoint["statistics"]:
            break
        for item in endpoint["statistics"]:
            print(item["school_title"])
        params["offset"] += 499
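The inner loop above retries forever with a fixed 3-second pause, so it never gives up if the server stays unreachable. A possible variation is to cap the number of attempts and increase the wait after each failure. The sketch below is only an illustration of that idea: `retry_call`, its parameters, and its default values are hypothetical names chosen here, not part of `requests` or of the original answer.

```python
import random
import time


def retry_call(func, exceptions, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is used up.

    Hypothetical helper: the names and defaults are assumptions for illustration.
    The wait doubles after every failure (base_delay, 2*base_delay, ...),
    plus up to 10% random jitter so retries do not arrive in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except exceptions:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the original error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

With this helper, the inner `while True` block in the answer could be replaced by a single line such as `endpoint = retry_call(lambda: con.get(link, params=params).json(), (requests.exceptions.ConnectionError,))`. The exception tuple is passed in explicitly because `requests.exceptions.ConnectionError` is not a subclass of Python's built-in `ConnectionError`.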