How to send batches of URLs into grequests?
I have a list of ~300K API URLs that I want to call and get data from:
lst = ['url.com','url2.com']
If I subset my list to 5 URLs, grequests handles the requests perfectly. However, when I pass in the full ~300K URLs, I get an error:
Problem: url.Iam.passing.in: HTTPSConnectionPool(host='url', port=xxx): Max retries exceeded with url: url.Iam.passing.in (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x552b17550>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
Traceback (most recent call last):
My code so far for making the asynchronous calls:
import grequests

class Test:
    def __init__(self):
        self.urls = lst

    def exception(self, request, exception):
        # Called by grequests.map for each request that raises
        print("Problem: {}: {}".format(request.url, exception))

    def run_async(self):  # renamed: `async` is a reserved keyword since Python 3.7
        return grequests.map((grequests.get(u, stream=False) for u in self.urls), exception_handler=self.exception, size=5)

    def collate_responses(self, results):
        # Failed requests come back as None, so skip them
        return [x.text for x in results if x is not None]

test = Test()
# here we collect the results returned by the async function
results = test.run_async()
response_text = test.collate_responses(results)
I'm not sure what I'm doing wrong, given that I already pass stream=False.
Is there any way to pass my list in batches?
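Try something along these lines:
import time
import grequests

def fetch(x):  # renamed: `async` is a reserved keyword since Python 3.7
    # .....do something here..... #
    # grequests.map expects an iterable of requests, so wrap the single request in a list
    return grequests.map([grequests.get(x, stream=False)], size=5)

for url in url_list:
    result = fetch(url)
    time.sleep(5)  # This will add a 5 second delay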
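Since the question asks about batches specifically, here is a minimal sketch of slicing the list into fixed-size chunks and pausing between them. The names `chunked` and `fetch_in_batches`, the batch size of 100, and the 5 second pause are all assumptions for illustration, not anything from grequests itself. The sender is passed in as a callable so the batching logic can be tried without touching the network.

```python
import time
from itertools import islice


def chunked(items, batch_size):
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def fetch_in_batches(urls, send_batch, batch_size=100, pause=5.0):
    """Send urls in batches through send_batch and collect the response texts.

    send_batch is any callable that takes a list of URLs and returns a list
    of response-like objects (with a .text attribute) or None for failures.
    """
    texts = []
    for batch in chunked(urls, batch_size):
        responses = send_batch(batch)
        # Skip entries that are None, i.e. requests that failed
        texts.extend(r.text for r in responses if r is not None)
        time.sleep(pause)  # assumed delay between batches
    return texts


# Possible usage with grequests (assumes `import grequests` at the top of the
# script and the `lst` of URLs from the question):
#
# def send_with_grequests(batch):
#     return grequests.map((grequests.get(u, stream=False) for u in batch), size=5)
#
# response_text = fetch_in_batches(lst, send_with_grequests)
```

Whether this makes the error go away depends on what causes it. The "nodename nor servname provided, or not known" message is a name lookup failure, so smaller batches may help if the lookups are simply being overwhelmed, but they will not help if some of the hostnames are invalid.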