Python SSL - recv being offset when data larger than 1400

I'm using the ssl module in Python and have run into what looks like a buffering problem.

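I have the following routine to handle data coming from the socket. I also added a while loop using pending, based on this question, but it did not solve the problem. I also increased the buffer size, to no avail.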

import sys

RECV_BUFFER = 131072

def handle(client_socket):
    try:
        rxdata = client_socket.recv(RECV_BUFFER)
        if rxdata:
            print("Rx: " + rxdata.decode())
            # Drain whatever the SSL object has already decrypted.
            while client_socket.pending():
                rxdata = client_socket.recv(RECV_BUFFER)
                sys.stdout.write(rxdata.decode())
    except Exception as e:
        print("Exception: " + str(e))

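For testing purposes I set up a user input prompt so that I can test directly. A GET / returns "Hello World!", while a GET /other returns a long string. Every time the buffer overflows, the responses become offset by one, as shown below.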

Command>GET /
Tx: GET /
Rx: HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
X-Cloud-Trace-Context: a65f614b75674fa723b7d69c1af03a0e;o=1
Date: Sun, 02 Sep 2018 16:00:19 GMT
Server: My Frontend
Content-Length: 12

Hello World!
Command>GET /other
Tx: GET /other
Rx: HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
X-Cloud-Trace-Context: 90033f7e308e07508106359c3e7c76d1
Date: Sun, 02 Sep 2018 16:00:23 GMT
Server: My Frontend
Content-Length: 1924

This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. T
Command>GET /
Tx: GET /
Rx: his is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. This is something else. End.
Command>GET /other
Tx: GET /other
Rx: HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
X-Cloud-Trace-Context: 160b0cd5f80982bf1e7ab7dd5d94996d
Date: Sun, 02 Sep 2018 16:00:26 GMT
Server: My Frontend
Content-Length: 12

Hello World!

What is going on here, and how can it be fixed?

I'm not entirely sure what you are trying to do, but I assume your server basically works like this:

  1. Read a command (one line).
  2. Send the complete response at once.

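Given that you use pending, which only checks whether there is still decrypted data in the SSL socket, my guess is that you assume data sent by the server in a single send will also be read by the client in a single recv. This is not the case. What actually happens is this: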

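  1. The server sends a lot of data, say 20000 bytes.
  2. At the SSL level this takes at least two SSL records, because a single record can carry at most 16384 bytes. So let's assume it sends one record of 16384 bytes and another with the rest (3616 bytes).
  3. ssl_socket.recv(RECV_BUFFER) reads at least as much data from the underlying TCP connection as is needed for one complete SSL record. It then decrypts the SSL record and returns up to RECV_BUFFER bytes of the decrypted data.
  4. ssl_socket.pending() tells you whether there is still unread decrypted data in the SSL socket. It does not check whether data is available on the underlying TCP socket. If there is still data in the SSL socket, the next ssl_socket.recv(...) returns from that data and does not try to read more from the underlying TCP socket. Only when no decrypted but unread data is left in the SSL socket will recv read more from the underlying TCP socket, but in that case pending returns 0 (false), so you never attempt to read more data.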
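
The arithmetic in step 2 can be checked with a few lines of Python. This is only a sketch: record_sizes is a hypothetical helper, and it assumes the sender fills every record up to the 16384-byte maximum, which real servers are free not to do (the output in the question suggests much smaller records of roughly 1400 bytes).

```python
MAX_RECORD_PAYLOAD = 16384  # maximum plaintext bytes carried by one record


def record_sizes(total_bytes, max_payload=MAX_RECORD_PAYLOAD):
    """Return the plaintext sizes of the records needed for total_bytes.

    Assumption: every record except the last is filled completely.
    """
    full, rest = divmod(total_bytes, max_payload)
    return [max_payload] * full + ([rest] if rest else [])


print(record_sizes(20000))  # [16384, 3616]
```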

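This means that your recv may read, decrypt and return only the first SSL record. So when you send the next command, you do not get the new response; you actually read the remaining response data from the previous request.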
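
The effect can be reproduced without a network using a toy model. FakeSSLSocket below is a hypothetical stand-in, not the real ssl.SSLSocket: it only mimics the two behaviours described above (recv decrypts one record at a time, pending counts only decrypted bytes), and unlike a real socket it returns empty bytes instead of blocking when nothing is left.

```python
class FakeSSLSocket:
    """Toy model of an SSL socket (hypothetical, for illustration only)."""

    def __init__(self, records):
        self._wire = list(records)  # records still waiting on the "TCP connection"
        self._decrypted = b""       # decrypted but not yet returned data

    def pending(self):
        # Counts only bytes that are already decrypted, never the wire.
        return len(self._decrypted)

    def recv(self, bufsize):
        # Fetch the next record only when no decrypted data is left.
        if not self._decrypted and self._wire:
            self._decrypted = self._wire.pop(0)
        data = self._decrypted[:bufsize]
        self._decrypted = self._decrypted[bufsize:]
        return data


def read_like_the_question(sock, bufsize=131072):
    """Same read pattern as handle() in the question."""
    data = sock.recv(bufsize)
    while sock.pending():
        data += sock.recv(bufsize)
    return data


response = b"x" * 20000
sock = FakeSSLSocket([response[:16384], response[16384:]])
first = read_like_the_question(sock)   # stops after the first record
second = read_like_the_question(sock)  # picks up the leftover of the same response
print(len(first), len(second))  # 16384 3616
```

Because the buffer is larger than a record, pending is already 0 after the first recv, so the loop never runs and the second record is only seen on the next call.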

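To fix the code you need to fix your assumptions: SSL has to be treated as a data stream, not as a message protocol (the same is true for TCP). This means you cannot assume that a message is read and returned in full, or even that it has already been read in full into the SSL object. Instead, you either need to know the size of the response up front (for example by prefixing the response with its length), or you need a clear marker that signals the end of the response and read until you reach that marker.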
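
Both strategies boil down to calling recv in a loop. The following is a minimal sketch, not a definitive implementation: recv_exact and recv_until are hypothetical helper names, and the demo uses a plain socket.socketpair() instead of an SSL connection, on the assumption that any object with a recv(bufsize) method behaves the same way here.

```python
import socket


def recv_exact(sock, size):
    """Read exactly size bytes, calling recv() as often as needed."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:  # peer closed the connection
            raise ConnectionError("connection closed before message was complete")
        buf += chunk
    return buf


def recv_until(sock, marker):
    """Read until marker is seen; return (data up to and including marker, extra bytes)."""
    buf = b""
    while marker not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before marker was seen")
        buf += chunk
    head, _, rest = buf.partition(marker)
    return head + marker, rest


# Demo over a local socket pair: a header block ended by a marker,
# followed by a body whose size is known up front.
a, b = socket.socketpair()
a.sendall(b"HEAD\r\n\r\n" + b"x" * 5000)
head, rest = recv_until(b, b"\r\n\r\n")
body = rest + recv_exact(b, 5000 - len(rest))
print(len(body))  # 5000
a.close()
b.close()
```

For HTTP the two are combined: read until the blank line that ends the headers, take the size from Content-Length, then read exactly that many body bytes.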

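This is the solution I finally settled on. I feel it is more correct than the solution posted earlier. It strips the headers by default; passing True as the second argument leaves them in place: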

import re
import socket

def handle(client_socket, raw=False):
    """Read one HTTP response; keep the headers only if raw is True."""
    data = client_socket.recv()
    # Assumes the complete header block arrives in the first recv().
    header_end = data.index(b'\r\n\r\n') + 4
    match = re.search(rb'Content-Length: (\d+)', data[:header_end])
    content_length = int(match.group(1))
    rxdata = data[header_end:]
    # Keep reading until the whole body has arrived.
    while len(rxdata) < content_length:
        try:
            chunk = client_socket.recv()
        except socket.timeout:
            break
        if not chunk:  # peer closed the connection
            break
        rxdata += chunk
    if raw:
        rxdata = data[:header_end] + rxdata
    return rxdata.decode()