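Twisted SSL socket connection slowdown
How can I scale my Twisted server to handle tens of thousands of concurrent SSL socket connections?
The first few hundred clients connect relatively quickly, but as the count approaches 3000 it slows to a crawl, at about 2 connections per second.
I am load testing with the following loop: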
import socket
import ssl

# `connections` is the number of test clients to open; it is set elsewhere.
clients = []
for i in xrange(connections):
    print i
    clients.append(
        ssl.wrap_socket(
            socket.socket(socket.AF_INET, socket.SOCK_STREAM),
            ca_certs="server.crt",
            cert_reqs=ssl.CERT_REQUIRED
        )
    )
    clients[i].connect(('localhost', 9999))
cProfile:
   296644049 function calls (296407530 primitive calls) in 3070.656 seconds
   Ordered by: cumulative time
   ncalls   tottime  percall   cumtime   percall  filename:lineno(function)
        1     0.001    0.001  3070.656  3070.656  server.py:7(<module>)
        1     0.000    0.000  3070.408  3070.408  server.py:148(main)
        1     0.000    0.000  3070.406  3070.406  server.py:106(run)
        1     0.000    0.000  3070.405  3070.405  base.py:1190(run)
        1     0.047    0.047  3070.404  3070.404  base.py:1195(mainLoop)
    34383     0.090    0.000  3070.263     0.089  epollreactor.py:367(doPoll)
    38696     0.064    0.000  3066.883     0.079  log.py:75(callWithLogger)
    38696     0.077    0.000  3066.797     0.079  log.py:70(callWithContext)
    38696     0.035    0.000  3066.598     0.079  context.py:117(callWithContext)
    38696     0.056    0.000  3066.556     0.079  context.py:61(callWithContext)
    38695     0.093    0.000  3066.486     0.079  posixbase.py:572(_doReadOrWrite)
     8599  1249.585    0.145  3019.333     0.351  protocol.py:114(getClientsDict)
 37582010  1681.445    0.000  1681.445     0.000  {method 'items' of 'dict' objects}
    21496     0.114    0.000  1535.798     0.071  tls.py:346(_flushReceiveBIO)
    21496     0.026    0.000  1535.793     0.071  tcp.py:199(doRead)
    21496     0.017    0.000  1535.718     0.071  tcp.py:218(_dataReceived)
    17197     0.033    0.000  1535.701     0.089  tls.py:400(dataReceived)
     8597     0.009    0.000  1531.480     0.178  policies.py:119(dataReceived)
     8597     0.078    0.000  1531.471     0.178  protocol.py:65(dataReceived)
     4300     0.029    0.000  1525.117     0.355  posixbase.py:242(_disconnectSelectable)
     4300     0.030    0.000  1524.922     0.355  tcp.py:283(connectionLost)
     4300     0.024    0.000  1524.659     0.355  tls.py:463(connectionLost)
     4300     0.010    0.000  1524.492     0.355  policies.py:123(connectionLost)
     4300     0.119    0.000  1524.471     0.355  protocol.py:50(connectionLost)
     4299     0.027    0.000  1523.698     0.354  tcp.py:270(readConnectionLost)
     4299     0.135    0.000  1520.228     0.354  protocol.py:88(handleInitialState)
 74840519    31.487    0.000    44.916     0.000  __init__.py:348(__getattr__)
Reactor run code:
def run(self):
    contextFactory = ssl.DefaultOpenSSLContextFactory(self._key, self._cert)
    reactor.listenSSL(self._port, BrakersFactory(), contextFactory)
    reactor.run()
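For readers who want to reproduce a listing like the one above, here is a minimal sketch of collecting a profile sorted by cumulative time with the standard `cProfile` and `pstats` modules. It is written for Python 3, whereas the thread's code is Python 2, and `profile_call` and `busy_work` are hypothetical names used only for illustration; in the question, the profiled callable would be the server's own `main`.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time and keep at most the top 20 rows.
    stats.sort_stats("cumulative").print_stats(20)
    return result, buf.getvalue()


def busy_work(n):
    # Hypothetical stand-in for the real workload; not from the question.
    return sum(i * i for i in range(n))


result, report = profile_call(busy_work, 1000)
print(report)
```

A whole script can be profiled the same way from the shell with `python -m cProfile -s cumulative server.py`.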
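Given the lack of code in the question, I put some code together to see whether I would experience the effect you describe. From that experiment, the first thing I would say is to check what happens to memory utilization on the machine while the script runs.
I started up a standard Google cloud compute system (1 vCPU, 3.8 GB of memory; Debian backports wheezy; apt-get update; apt-get install python-twisted) and ran the following (horrible hack) code:
(Note: to run this, I needed to do a ulimit -n 4096 in both the client and server shells, otherwise I would start getting 'Too many open files' errors, i.e. Socket accept - "Too many open files")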
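The descriptor limit mentioned in the note can also be inspected from inside Python before starting a test. The sketch below uses the standard `resource` module (Unix only). The `can_hold` helper and its `headroom` value are assumptions made for illustration: it supposes one descriptor per socket plus a small allowance for log files, the listening socket and so on.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_hold(connections, headroom=64):
    """Rough check: one descriptor per socket plus an assumed headroom."""
    soft, _hard = nofile_limits()
    if soft == resource.RLIM_INFINITY:
        return True
    return connections + headroom <= soft


soft, hard = nofile_limits()
print("soft=%s hard=%s enough_for_4000=%s" % (soft, hard, can_hold(4000)))
```

If the soft limit is too low, raising it with `ulimit -n` in the shell that launches the process, as described above, is the simplest fix.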
serv.py
#!/usr/bin/python
from twisted.internet import ssl, reactor
from twisted.internet.protocol import ServerFactory, Protocol


class Echo(Protocol):
    def connectionMade(self):
        self.factory.clients.append(self)
        print "Currently %d open connections.\n" % len(self.factory.clients)

    def connectionLost(self, reason):
        self.factory.clients.remove(self)
        print "Lost connection"

    def dataReceived(self, data):
        """As soon as any data is received, write it back."""
        self.transport.write(data)


class MyServerFactory(ServerFactory):
    protocol = Echo

    def __init__(self):
        self.clients = []


if __name__ == '__main__':
    factory = MyServerFactory()
    reactor.listenSSL(8000, factory,
                      ssl.DefaultOpenSSLContextFactory(
                          'keys/server.key', 'keys/server.crt'))
    reactor.run()
cli.py
#!/usr/bin/python
from twisted.internet import ssl, reactor
from twisted.internet.protocol import ClientFactory, Protocol


class EchoClient(Protocol):
    def connectionMade(self):
        print "hello, world"
        # The following delay is there because as soon as the write
        # happens the server will close the connection
        reactor.callLater(60, self.transport.write, "hello, world!")

    def dataReceived(self, data):
        print "Server said:", data
        self.transport.loseConnection()


class EchoClientFactory(ClientFactory):
    protocol = EchoClient

    def __init__(self):
        self.stopping = False

    def clientConnectionFailed(self, connector, reason):
        print "Connection failed - reason ", reason
        if not self.stopping:
            self.stopping = True
            reactor.callLater(10, reactor.stop)

    def clientConnectionLost(self, connector, reason):
        print "Connection lost - goodbye!"
        if not self.stopping:
            self.stopping = True
            reactor.callLater(10, reactor.stop)


if __name__ == '__main__':
    connections = 4000
    factory = EchoClientFactory()
    for i in xrange(connections):
        # the following could certainly be done more elegantly, but I believe
        # it's a legit use, and given the list is finite, shouldn't be too
        # resource intensive of a use... ?
        reactor.callLater(i/float(400), reactor.connectSSL,
                          'xx.xx.xx.xx', 8000, factory,
                          ssl.ClientContextFactory())
    reactor.run()
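While running, and on crossing 2544 connections, my machine clogged up badly enough that it was hard to collect data from it. But given that new ssh sessions returned '/bin/bash: Cannot allocate memory', and that when I did get in serv.py had 2g of RES and the client had 1.4g, I think it's safe to say I blew out the RAM.
Given that the code above is just a quick hack, I may have glaring bugs that cause the memory problems. Still, I figured I'd offer the idea, since making your machine swap is certainly a good way to make your application crawl. (Maybe you have the same bug I do.)
(BTW, to those with more Twisted smarts than me: I welcome comments on what I'm doing wrong that burns so much memory.)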
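One way to watch memory from inside the process, rather than through top over a struggling ssh session, is the standard `resource` module. This is only a sketch: `peak_rss_mib` is a hypothetical helper, and it reports the peak resident set size so far, not the current one.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of this process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


print("peak RSS: %.1f MiB" % peak_rss_mib())
```

In serv.py or cli.py this could be logged at intervals, for example from a `twisted.internet.task.LoopingCall`, to see how memory grows with the connection count.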
I managed to identify the cause of my protocol's slowdown.
As you can see from the cProfile output above, most of the time is spent in the getClientsDict() method:
   296644049 function calls (296407530 primitive calls) in 3070.656 seconds
   Ordered by: cumulative time
   ncalls   tottime  percall   cumtime   percall  filename:lineno(function)
     8599  1249.585    0.145  3019.333     0.351  protocol.py:114(getClientsDict)
 37582010  1681.445    0.000  1681.445     0.000  {method 'items' of 'dict' objects}
The following code caused the problem:
def getClientsDict(self):
    rc = {1: {}, 2: {}}
    for r in self.factory._clients[1]:
        rc[1] = dict(rc[1].items() +
                     {r.getDict[1]['id']:
                      r.getDict[1]['address']}.items())
    for m in self.factory._clients[2]:
        rc[2] = dict(rc[2].items() +
                     {m.getDict[2]['id']:
                      m.getDict[2]['address']}.items())
    return rc
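The reason this is slow is that every loop iteration materializes the full item list of the dict built so far and then constructs a brand-new dict from it, so building the result for n clients costs on the order of n² operations. The profile is consistent with that: 37,582,010 calls to `items` across 8,599 calls to getClientsDict is roughly 4,370 `items` calls per invocation. A minimal sketch of a linear-time version follows. It assumes, as the original access pattern suggests, that `getDict` is a dict-like attribute keyed by client kind; `FakeClient` and `get_clients_dict` are hypothetical names for illustration, not the author's actual fix.

```python
class FakeClient(object):
    """Hypothetical stand-in for a connected protocol instance."""

    def __init__(self, kind, client_id, address):
        # Mirrors the r.getDict[kind]['id'] / ['address'] access in the original.
        self.getDict = {kind: {"id": client_id, "address": address}}


def get_clients_dict(clients_by_kind):
    """Build {kind: {id: address}} in one pass, without copying per client."""
    rc = {1: {}, 2: {}}
    for kind in (1, 2):
        for client in clients_by_kind[kind]:
            info = client.getDict[kind]
            # Assign in place instead of rebuilding the whole dict each time.
            rc[kind][info["id"]] = info["address"]
    return rc


clients = {
    1: [FakeClient(1, "a", "10.0.0.1"), FakeClient(1, "b", "10.0.0.2")],
    2: [FakeClient(2, "c", "10.0.0.3")],
}
print(get_clients_dict(clients))
```

If the mapping is needed on every connect and disconnect, it may be cheaper still to keep it on the factory and update it in connectionMade and connectionLost, so that it never has to be rebuilt from scratch.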