Twisted HTTPS Client

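I'm currently having trouble accessing content hosted over HTTPS with the Twisted Python library. I'm new to the library, so I assumed I was missing some concept, but judging by the example I followed, that may not be the case.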

Here is the link to the page where I found the example: https://twistedmatrix.com/documents/current/web/howto/client.html

It appears under the heading "HTTP over SSL":

from twisted.python.log import err
from twisted.web.client import Agent
from twisted.internet import reactor
from twisted.internet.ssl import optionsForClientTLS

def display(response):
    print("Received response")
    print(response)

def main():
    contextFactory = optionsForClientTLS(u"https://example.com/")
    agent = Agent(reactor, contextFactory)
    d = agent.request("GET", "https://example.com/")
    d.addCallbacks(display, err)
    d.addCallback(lambda ignored: reactor.stop())
    reactor.run()

if __name__ == "__main__":
    main()

When I run this code, it fails immediately with the following error:

Traceback (most recent call last):
  File "https.py", line 19, in <module>
    main()
  File "https.py", line 11, in main
    contextFactory = optionsForClientTLS(u"https://example.com/")
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 1336, in optionsForClientTLS
    return ClientTLSOptions(hostname, certificateOptions.getContext())
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 1198, in __init__
    self._hostnameBytes = _idnaBytes(hostname)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 86, in _idnaBytes
    return idna.encode(text)
  File "/usr/local/lib/python2.7/dist-packages/idna/core.py", line 355, in encode
    result.append(alabel(label))
  File "/usr/local/lib/python2.7/dist-packages/idna/core.py", line 276, in alabel
    check_label(label)
  File "/usr/local/lib/python2.7/dist-packages/idna/core.py", line 253, in check_label
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
idna.core.InvalidCodepoint: Codepoint U+003A at position 6 of u'https://example' not allowed

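This error led me to believe that the argument passed to optionsForClientTLS is incorrect: it expects a hostname, not a full URL. I therefore shortened the argument to just example.com, and after that change the call completes successfully.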
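The label in the error message, u'https://example', is the first dot-separated piece of the URL, and the colon at position 6 is what the IDNA encoder rejects. As a small sketch of the fix described above, the hostname can be pulled out of the URL with the standard library instead of being typed by hand. The helper name hostname_from_url is hypothetical and not part of Twisted.

```python
from __future__ import print_function

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def hostname_from_url(url):
    """Return only the host part of a URL (hypothetical helper)."""
    return urlparse(url).hostname


url = u"https://example.com/"
hostname = hostname_from_url(url)
print(hostname)  # example.com

# This bare hostname, not the full URL, is the kind of value that
# optionsForClientTLS expects as its first argument.
```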

Unfortunately, after that change the script now fails on the line that calls agent.request, with this error:

Traceback (most recent call last):
  File "https.py", line 19, in <module>
    main()
  File "https.py", line 13, in main
    d = agent.request("GET", "https://example.com/")
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 1596, in request
    endpoint = self._getEndpoint(parsedURI)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 1580, in _getEndpoint
    return self._endpointFactory.endpointForURI(uri)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 1456, in endpointForURI
    uri.port)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 982, in creatorForNetloc
    context = self._webContextFactory.getContext(hostname, port)
AttributeError: 'ClientTLSOptions' object has no attribute 'getContext'

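This error leads me to believe that the object returned by optionsForClientTLS is not the type of object that Agent expects to receive when it is created: Agent ends up calling a method that does not exist on it. With that in mind, I have two questions.

  1. Is this example deprecated? The earlier examples that make plain HTTP requests all work fine. Am I doing something wrong, or is this example simply no longer valid?
  2. I am just looking for a simple way to retrieve data from a server over HTTPS. If this approach is not the way to do it, is anyone familiar with how to make HTTPS requests with Twisted?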
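For readers following the second traceback, here is a library-free sketch of the mismatch it shows. Both classes are hypothetical stand-ins and not Twisted code: one mimics the wrapper whose creatorForNetloc calls getContext(hostname, port), the other mimics an options object that has no such method.

```python
from __future__ import print_function


class LegacyPolicyWrapper(object):
    """Hypothetical stand-in for the wrapper seen in the traceback.

    It assumes the wrapped object has a getContext(hostname, port) method.
    """

    def __init__(self, webContextFactory):
        self._webContextFactory = webContextFactory

    def creatorForNetloc(self, hostname, port):
        return self._webContextFactory.getContext(hostname, port)


class FakeClientTLSOptions(object):
    """Hypothetical stand-in for ClientTLSOptions: it has no getContext."""

    def __init__(self, hostname):
        self.hostname = hostname


def try_request(context_factory):
    """Return None on success, or the AttributeError message on failure."""
    wrapper = LegacyPolicyWrapper(context_factory)
    try:
        wrapper.creatorForNetloc(b"example.com", 443)
    except AttributeError as e:
        return str(e)
    return None


message = try_request(FakeClientTLSOptions(u"example.com"))
print(message)
```

The call fails for the same reason as in the traceback: the wrapper expects a factory-style object, while the object it was given is built for a single hostname and offers no getContext method.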

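Yes, you are absolutely right: the example in the documentation is wrong. I noticed the error while working with treq. Try following this example from v14. That said, you should use treq rather than working with Twisted directly; most of the heavy lifting has already been done for you. Here is a simple conversion of your example: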

from __future__ import print_function
import treq
from twisted.internet import defer, task
from twisted.python.log import err

@defer.inlineCallbacks
def display(response):
    content = yield treq.content(response)
    print('Content: {0}'.format(content))

def main(reactor):
    d = treq.get('https://twistedmatrix.com')
    d.addCallback(display)
    d.addErrback(err)
    return d

task.react(main)

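As you can see, treq takes care of the SSL details for you. The display() callback can be used to extract the various parts of the HTTP response, such as the headers, status code, and body. If you only need one part, such as the response body, you can simplify further like this: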

def main(reactor):
    d = treq.get('https://twistedmatrix.com')
    d.addCallback(treq.content)     # get response content when available
    d.addCallback(print)            # needs print_function on Python 2
    d.addErrback(err)               # last, so a failure is logged instead of printing None
    return d

task.react(main)