Apache HttpClient connection configuration

I am trying to set up an HttpClient through the HttpClientBuilder. I also had a look at the HttpClientConnectionManager, and this is where the confusion starts.

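On the ConnectionManager, or more precisely the PoolingHttpClientConnectionManager, there are methods to close expired connections and to close idle connections. This raises a few questions: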

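When is a connection considered expired?
When is it considered idle?
What happens when connections in the pool are closed? Is it guaranteed that connections are recreated when they are needed?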

According to: https://hc.apache.org/httpcomponents-client-4.5.x/current/tutorial/html/connmgmt.html#d5e418

HttpClient tries to mitigate the problem by testing whether the connection is 'stale', that is no longer valid because it was closed on the server side, prior to using the connection for executing an HTTP request. The stale connection check is not 100% reliable. The only feasible solution that does not involve a one thread per socket model for idle connections is a dedicated monitor thread used to evict connections that are considered expired due to a long period of inactivity. The monitor thread can periodically call ClientConnectionManager#closeExpiredConnections() method to close all expired connections and evict closed connections from the pool. It can also optionally call ClientConnectionManager#closeIdleConnections() method to close all connections that have been idle over a given period of time.
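The monitor thread described in the quote could be sketched roughly as follows. This is only an illustration: EvictablePool is a hypothetical interface that mirrors the two method names from the tutorial so the example compiles without the HttpClient jar, and IdleConnectionMonitor and its parameters are likewise made up for this sketch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the two connection manager methods named in the tutorial. */
interface EvictablePool {
    void closeExpiredConnections();
    void closeIdleConnections(long idleTime, TimeUnit unit);
}

/** Periodically evicts expired and long-idle connections from a pool. */
final class IdleConnectionMonitor implements AutoCloseable {
    private final EvictablePool pool;
    private final long maxIdleSeconds;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "idle-connection-monitor");
                t.setDaemon(true); // do not keep the JVM alive just for eviction
                return t;
            });

    IdleConnectionMonitor(EvictablePool pool, long maxIdleSeconds) {
        this.pool = pool;
        this.maxIdleSeconds = maxIdleSeconds;
    }

    /** One eviction pass: expired connections first, then long-idle ones. */
    void evictOnce() {
        pool.closeExpiredConnections();
        pool.closeIdleConnections(maxIdleSeconds, TimeUnit.SECONDS);
    }

    /** Starts the background thread that runs an eviction pass every periodSeconds. */
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(
                this::evictOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Stops the background thread; the pool itself is left untouched. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In practice you may not need to write this yourself: since version 4.4, HttpClientBuilder offers evictExpiredConnections() and evictIdleConnections(long, TimeUnit), which start a comparable background evictor for the client's connection manager.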

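The difference between expired and idle is that an expired connection has outlived its keep-alive or time-to-live deadline and has typically already been closed on the server side, whereas an idle connection is not necessarily closed on the server side; it has simply gone unused for some time. When a connection is closed, it is evicted and its slot in the pool becomes available again, so a new connection can be opened when one is needed.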
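To make the distinction concrete, here is a toy model of a pooled connection. It is not HttpClient's actual pool entry class; the class name, fields, and the assumption that expiry is a fixed deadline set at creation are all simplifications made for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Toy model of a pooled connection; names and fields are hypothetical. */
final class PooledConnectionSketch {
    private final long expiryMillis; // absolute deadline after which the server is presumed to have closed it
    private long lastUsedMillis;     // last time the connection was handed back to the pool

    PooledConnectionSketch(long createdMillis, long keepAliveMillis) {
        this.lastUsedMillis = createdMillis;
        this.expiryMillis = createdMillis + keepAliveMillis;
    }

    /** Expired: the validity deadline has passed, whether or not it was used recently. */
    boolean isExpired(long nowMillis) {
        return nowMillis >= expiryMillis;
    }

    /** Idle: unused for longer than the given period, regardless of the deadline. */
    boolean isIdleLongerThan(long nowMillis, long idleTime, TimeUnit unit) {
        return nowMillis - lastUsedMillis > unit.toMillis(idleTime);
    }

    /** Records that the connection was just used and returned to the pool. */
    void markUsed(long nowMillis) {
        lastUsedMillis = nowMillis;
    }
}
```

A connection created at time 0 with a 60 second deadline is idle but not expired at the 30 second mark if nobody has used it, which is the case the two separate eviction methods are meant to cover.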

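HTTP is based on TCP, which takes care of delivering packets in the correct order and requesting retransmission if a packet is lost along the way. A TCP connection starts with the TCP handshake, consisting of SYN, SYN-ACK, and ACK messages, and ends with a FIN, ACK-FIN, and ACK sequence, as can be seen in this image taken from Wikipedia:

[Figure: TCP connection setup and teardown, from Wikipedia]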

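Although HTTP is a request-response protocol, opening and closing connections is expensive, so HTTP/1.1 allows existing connections to be reused. With the header Connection: keep-alive, the client (for example, your browser) is told to keep its connection to the server open. A server can have thousands of open connections at the same time. To avoid draining the server's resources, connections are usually limited in time: through a socket timeout, the server automatically closes connections that are idle or that have some connection problem (interrupted internet access, etc.) after a predefined period.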

Many HTTP implementations, such as Apache HttpClient 4.4 and later, check the state of a connection only when it is about to be used.

The handling of stale connections was changed in version 4.4. Previously, the code would check every connection by default before re-using it. The code now only checks the connection if the elapsed time since the last use of the connection exceeds the timeout that has been set. The default timeout is set to 2000ms (Source)
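The rule in the quote can be written down as a small decision function. This is a sketch of the described behavior, not HttpClient's own code; the class and method names are hypothetical, and treating "exceeds" as a strict comparison is an assumption about the boundary case.

```java
/** Sketch of the post-4.4 stale check decision; not HttpClient's own implementation. */
final class StaleCheckPolicy {
    /** Default inactivity threshold mentioned in the quote (2000 ms). */
    static final long DEFAULT_VALIDATE_AFTER_INACTIVITY_MS = 2000L;

    private final long validateAfterInactivityMs;

    StaleCheckPolicy(long validateAfterInactivityMs) {
        this.validateAfterInactivityMs = validateAfterInactivityMs;
    }

    /**
     * Returns true when the connection has been unused for longer than the threshold
     * and should therefore be tested for staleness before it is reused.
     * Assumption: "exceeds" is taken literally, so the comparison is strict.
     */
    boolean shouldValidate(long lastUsedMillis, long nowMillis) {
        return nowMillis - lastUsedMillis > validateAfterInactivityMs;
    }
}
```

With the default threshold, a connection reused 1.5 seconds after its last use is handed out unchecked, while one reused after 2.5 seconds is checked first. In HttpClient 4.4 and later the threshold can be adjusted with setValidateAfterInactivity(int) on PoolingHttpClientConnectionManager.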

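If a connection has not been used for a while, the client may not have read the ACK-FIN from the server, and therefore still thinks the connection is open when it was actually closed by the server some time ago. Such a connection is expired and is usually called half-closed. It can therefore be collected by the pool.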

Note that if you send a request containing the Connection: close HTTP header, the connection should be closed right after the client receives the response.

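The state of open connections can be checked with netstat, which should be available on most modern operating systems. I recently had to check one of our HTTP clients, which was managed through a third-party library that did not properly propagate the Connection: close header and therefore left behind a large number of half-closed connections.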
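If you want to tally connection states from netstat output rather than read it by eye, a small helper along these lines may be enough. It is a sketch that assumes output in the style of netstat -an, where the TCP state is the last column; the class name is hypothetical, and connections that the server has closed but the client has not typically appear on the client as CLOSE_WAIT.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Counts TCP connection states in netstat-style output (state assumed to be the last column). */
final class NetstatStateCounter {
    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String trimmed = line.trim();
            // Skip headers, UDP entries, and anything else that is not a TCP line.
            if (!trimmed.regionMatches(true, 0, "tcp", 0, 3)) {
                continue;
            }
            String[] cols = trimmed.split("\\s+");
            String state = cols[cols.length - 1];
            counts.merge(state, 1, Integer::sum);
        }
        return counts;
    }
}
```

Feeding it the captured lines of a netstat run gives a map from state name to count; a steadily growing CLOSE_WAIT count on the client is a hint that closed connections are not being cleaned up.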