Getting back-to-back errors when using the "wait_on_rate_limit" parameter
To avoid rate limit errors, I passed the wait_on_rate_limit parameter when creating the API object:
api = tweepy.API(auth,wait_on_rate_limit=True,wait_on_rate_limit_notify=True)
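At first my program ran fine. When I exceeded the rate limit, I got the message "Rate limit reached. Sleeping for: 909". My program slept for that long and then carried on collecting data. At some point, though, I got several back-to-back errors: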
...
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
...
urllib3.exceptions.ProtocolError: ('Connection aborted.',
ConnectionResetError(10054, 'An existing connection was forcibly closed by
the remote host', None, 10054, None))
During handling of the above exception, another exception occurred:
...
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
During handling of the above exception, another exception occurred:
...
tweepy.error.TweepError: Failed to send request: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
My code:
for user in tweepy.Cursor(api.friends, id="twitter").items():
    friendsOfUser = user.screen_name
    ## Do something with friendsOfUser
Is there anything I can do?
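There is nothing you can do about the host closing the connection. If you are already waiting on the rate limit, I'd bet you are being a bit aggressive with the API :) Try catching TweepError and explicitly waiting a while before trying again.
You could try something like this: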
import time
...
try:
    for user in tweepy.Cursor(api.friends, id="twitter").items():
        friendsOfUser = user.screen_name
        ...
except tweepy.TweepError:
    time.sleep(120)  # sleep for 2 minutes; you may try a different duration
This worked for me:
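from time import sleep

backoff_counter = 1
while True:
    try:
        for user in tweepy.Cursor(api.friends, id="twitter").items():
            pass  # do something with user
        break  # the loop finished without errors, so stop retrying
    except tweepy.TweepError as e:
        print(e.reason)
        sleep(60 * backoff_counter)  # wait longer after each failure
        backoff_counter += 1
        continue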
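Basically, when you hit an error, you sleep for a while and then try again. I use an incremental backoff so that the sleep is long enough for the connection to be re-established.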
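The incremental backoff described above can also be pulled out into a small reusable helper. This is only a minimal sketch: the name retry_with_backoff and its parameters (task, retry_on, base_delay, max_attempts, sleep) are made up for illustration and are not part of tweepy. Unlike the loop above, it gives up after a fixed number of attempts instead of retrying forever.

```python
import time


def retry_with_backoff(task, retry_on=(Exception,), base_delay=60,
                       max_attempts=5, sleep=time.sleep):
    """Call task() until it succeeds, sleeping longer after each failure.

    All names here are hypothetical helpers, not tweepy API.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * attempt  # 60 s, 120 s, 180 s, ...
            print(f"Attempt {attempt} failed ({exc}); sleeping {delay} s")
            sleep(delay)
```

With tweepy 3.x you would presumably wrap the whole cursor loop in a function, pass it as task, and set retry_on=(tweepy.TweepError,). Note that each retry then restarts the cursor from the first page, just like the while loop above.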
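To avoid this, you can add a pause after each request. The endpoint my script uses only allows 15 requests per 15 minutes, so I make one request per minute, which gets the most data out of the limit.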
for page in tweepy.Cursor(api.followers, screen_name=user_name, wait_on_rate_limit=True, count=200).pages():
    try:
        followers.extend(page)
        print("-->", len(followers))
        if len(followers) % 100 == 0:
            save_followers_to_csv(user_name, followers)
        time.sleep(60)
    except tweepy.TweepError as e:
        print("Going to sleep:", e)
        time.sleep(60)
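The one-request-per-minute pacing can be separated from the processing code with a small generator. This is a rough sketch, not something tweepy provides: throttled and its parameters (min_interval, clock, sleep) are hypothetical names. It assumes that each item pulled from the iterable costs one request, which should hold when iterating over pages.

```python
import time


def throttled(iterable, min_interval=60, clock=time.monotonic, sleep=time.sleep):
    """Yield items, keeping at least min_interval seconds between fetches.

    Hypothetical helper: one fetch from the iterable is assumed to be one request.
    """
    last_fetch = None
    iterator = iter(iterable)
    while True:
        if last_fetch is not None:
            # Only sleep for whatever part of the interval is still left.
            remaining = min_interval - (clock() - last_fetch)
            if remaining > 0:
                sleep(remaining)
        last_fetch = clock()
        try:
            item = next(iterator)
        except StopIteration:
            return
        yield item
```

Wrapping the .pages() iterator as throttled(cursor.pages(), min_interval=60) would then space the requests one minute apart without a time.sleep call in the loop body. Time spent processing a page counts toward the interval, so slow processing does not add extra delay.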