How to solve ConnectionError (RemoteDisconnected) in Python?

I am trying to scrape https://gmatclub.com/forum/decision-tracker.html and I can get most of the content I want, but sometimes I hit ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

How can I solve this?

My code is:

import requests

link = 'https://gmatclub.com/api/schools/v1/forum/app-tracker-latest-updates'
params = {
    'limit': 500,
    'offset': 0,
    'year': 'all'
}

with requests.Session() as con:
    con.headers["User-Agent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.86 YaBrowser/21.3.0.740 Yowser/2.5 Safari/537.36"
    con.get("https://gmatclub.com/forum/decision-tracker.html")
    while True:
        endpoint = con.get(link,params=params).json()
        if not endpoint["statistics"]:break
        for item in endpoint["statistics"]:
            print(item['school_title'])

        params['offset']+=499

One strategy is to repeat the request until you get a correct response from the server, for example:

import requests
from time import sleep

link = "https://gmatclub.com/api/schools/v1/forum/app-tracker-latest-updates"
params = {"limit": 500, "offset": 0, "year": "all"}

with requests.Session() as con:
    con.headers[
        "User-Agent"
    ] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.86 YaBrowser/21.3.0.740 Yowser/2.5 Safari/537.36"
    con.get("https://gmatclub.com/forum/decision-tracker.html")
    while True:

        # repeat the request until the server returns a valid response:
        while True:
            try:
                endpoint = con.get(link, params=params).json()
                break
            except requests.exceptions.ConnectionError:
                sleep(3)  # wait a few seconds, then try again

        if not endpoint["statistics"]:
            break
        for item in endpoint["statistics"]:
            print(item["school_title"])

        params["offset"] += 499
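
Note that the inner loop above retries forever, so the script hangs if the server never recovers. A possible variation is to cap the number of attempts and wait longer after each failure. The following is only a minimal sketch: retry_call and its parameters (max_attempts, base_delay, sleep) are hypothetical names introduced here for illustration and are not part of requests.

```python
import time


def retry_call(func, exceptions, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    The wait doubles after every failure: base_delay, 2 * base_delay, ...
    If the final attempt also fails, its exception is re-raised.
    (Hypothetical helper for illustration; not a requests API.)
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == max_attempts:
                raise  # give up and let the caller see the original error
            sleep(base_delay * 2 ** (attempt - 1))
```

With this helper, the inner while True loop could be replaced by a single call such as endpoint = retry_call(lambda: con.get(link, params=params).json(), requests.exceptions.ConnectionError). Passing a timeout to con.get may also be worth considering, so that a stalled connection fails instead of blocking indefinitely.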