Writing a Twisted client to send looping GET requests to multiple API endpoints and record the responses
It's been a while since I've done any Twisted programming, so I'm trying to get back into it for a new project. I'm trying to set up a Twisted client that can take a list of servers as an argument and, for each server, send an API GET call and write the return message to a file. This API GET call should be repeated every 60 seconds.
I've managed to do it for a single server using Twisted's Agent class:
from StringIO import StringIO
from twisted.internet import reactor
from twisted.internet.protocol import Protocol
from twisted.web.client import Agent
from twisted.web.http_headers import Headers
from twisted.internet.defer import Deferred
import datetime
from datetime import timedelta
import time
import optparse

count = 1
filename = "test.csv"

class server_response(Protocol):
    def __init__(self, finished):
        print "init server response"
        self.finished = finished
        self.remaining = 1024 * 10

    def dataReceived(self, bytes):
        if self.remaining:
            display = bytes[:self.remaining]
            print 'Some data received:'
            print display
            with open(filename, "a") as myfile:
                myfile.write(display)
            self.remaining -= len(display)

    def connectionLost(self, reason):
        print 'Finished receiving body:', reason.getErrorMessage()
        self.finished.callback(None)

def capture_response(response):
    print "Capturing response"
    finished = Deferred()
    response.deliverBody(server_response(finished))
    print "Done capturing:", finished
    return finished

def responseFail(err):
    print "error:", err
    reactor.stop()

def cl(ignored):
    print "sending req"
    agent = Agent(reactor)
    headers = {
        'authorization': [<snipped>],
        'cache-control': [<snipped>],
        'postman-token': [<snipped>]
    }
    URL = <snipped>
    print URL
    a = agent.request(
        'GET',
        URL,
        Headers(headers),
        None)
    a.addCallback(capture_response)
    reactor.callLater(60, cl, None)
    #a.addBoth(cbShutdown, count)

def cbShutdown(ignored, count):
    print "reactor stop"
    reactor.stop()

def parse_args():
    usage = """usage: %prog [options] [hostname]:port ...
    Run it like this:
    python test.py hostname1:instanceName1 hostname2:instancename2 ...
    """
    parser = optparse.OptionParser(usage)
    _, addresses = parser.parse_args()
    if not addresses:
        print parser.format_help()
        parser.exit()

    def parse_address(addr):
        if ':' not in addr:
            hostName = '127.0.0.1'
            instanceName = addr
        else:
            hostName, instanceName = addr.split(':', 1)
        return hostName, instanceName

    return map(parse_address, addresses)

if __name__ == '__main__':
    d = Deferred()
    d.addCallbacks(cl, responseFail)
    reactor.callWhenRunning(d.callback, None)
    reactor.run()
But I'm having trouble figuring out how to get multiple agents sending calls. With this code I'm relying on the tail of cl() --- reactor.callLater(60, cl, None) --- to create the call loop. So how do I create multiple calling agent protocols (server_response(Protocol)) and keep looping through a GET for each of them once my reactor starts?
Look what the cat dragged in!
So how do I create multiple call agent
Use treq. You rarely want to get tangled up with the Agent class.
This API GET call should be repeated every 60 seconds
Use LoopingCall instead of callLater; it's easier in this case and you'll run into fewer problems later.
import treq
from twisted.internet import task, reactor

filename = 'test.csv'

def writeToFile(content):
    with open(filename, 'ab') as f:
        f.write(content)

def everyMinute(*urls):
    for url in urls:
        d = treq.get(url)
        d.addCallback(treq.content)
        d.addCallback(writeToFile)

#----- Main -----#
sites = [
    'https://www.google.com',
    'https://www.amazon.com',
    'https://www.facebook.com']
repeating = task.LoopingCall(everyMinute, *sites)
repeating.start(60)
reactor.run()
It starts the everyMinute() function, which runs every 60 seconds. Within that function, each endpoint is queried, and once the contents of the response become available, the treq.content function takes the response and returns its contents. Finally the contents are written to a file.
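If you also need the authorization headers and some error handling from your original Agent version, treq.get accepts a headers argument, and you can attach an errback so one failing endpoint doesn't stop the logging for the others. Here's a rough sketch along those lines; the header values are placeholders, and reading the URL list from sys.argv (rather than hard-coding sites) is just my guess at how you'd wire in your server-list argument:
import sys
import treq
from twisted.internet import task, reactor

filename = 'test.csv'

# Placeholder header values -- substitute your real tokens.
headers = {
    'authorization': ['<your token>'],
    'cache-control': ['no-cache'],
}

def writeToFile(content):
    # Append the raw response body, same as above.
    with open(filename, 'ab') as f:
        f.write(content)

def logFailure(failure, url):
    # Report the error but keep the loop going for the other URLs.
    print('Request to %s failed: %s' % (url, failure.getErrorMessage()))

def everyMinute(*urls):
    for url in urls:
        d = treq.get(url, headers=headers)
        d.addCallback(treq.content)
        d.addCallback(writeToFile)
        d.addErrback(logFailure, url)

if __name__ == '__main__':
    # e.g. python poll.py https://host1/api/status https://host2/api/status
    sites = sys.argv[1:]
    repeating = task.LoopingCall(everyMinute, *sites)
    repeating.start(60)
    reactor.run()
Note that repeating.start(60) fires the first round immediately and then every 60 seconds; pass now=False if you'd rather wait a minute before the first request.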
PS
Are you scraping or trying to extract content from those sites? If so, scrapy might be a good option for you.