Apache HttpClient connection configuration
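I am trying to set up an HttpClient through the HttpClientBuilder. I also had a look at the HttpClientConnectionManager, and this is where the confusion starts.
On the ConnectionManager, or more precisely the PoolingHttpClientConnectionManager, there are methods to:
- close expired connections
- close idle connections
When is a connection considered expired?
When is it idle?
What happens when a connection in the pool is closed? Is it guaranteed that the connection is recreated when needed?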
According to: https://hc.apache.org/httpcomponents-client-4.5.x/current/tutorial/html/connmgmt.html#d5e418
HttpClient tries to mitigate the problem by testing whether the
connection is 'stale', that is no longer valid because it was closed
on the server side, prior to using the connection for executing an
HTTP request. The stale connection check is not 100% reliable. The
only feasible solution that does not involve a one thread per socket
model for idle connections is a dedicated monitor thread used to evict
connections that are considered expired due to a long period of
inactivity. The monitor thread can periodically call
ClientConnectionManager#closeExpiredConnections() method to close all
expired connections and evict closed connections from the pool. It can also optionally call ClientConnectionManager#closeIdleConnections() method to close all connections that have been idle over a given period of time.
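The difference between expired and idle is that an expired connection has outlived its keep-alive period and has most likely already been closed on the server side, whereas an idle connection is not necessarily closed on the server side but has simply gone unused for some time. When a pooled connection is closed, it is evicted and its slot in the pool becomes available again, so a new connection is opened the next time one is needed.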
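To make the distinction concrete, here is a minimal plain-Java sketch. It is a toy model and not HttpClient's actual implementation: the class `SketchPool`, its `Entry` type, and all method names are hypothetical and chosen only to mirror the two eviction calls mentioned in the quote above. Timestamps are passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical toy pool; NOT the real PoolingHttpClientConnectionManager. */
class SketchPool {
    /** One pooled connection: when it was last used and when its keep-alive ends. */
    static final class Entry {
        final String name;
        final long lastUsedMillis;
        final long expiryMillis; // absolute deadline, e.g. derived from a keep-alive value

        Entry(String name, long lastUsedMillis, long expiryMillis) {
            this.name = name;
            this.lastUsedMillis = lastUsedMillis;
            this.expiryMillis = expiryMillis;
        }
    }

    private final List<Entry> available = new ArrayList<>();

    /** Returns a connection to the pool after use. */
    void release(Entry entry) {
        available.add(entry);
    }

    /** Evicts connections whose deadline has passed; returns how many were closed. */
    int closeExpired(long nowMillis) {
        return evict(e -> nowMillis >= e.expiryMillis);
    }

    /** Evicts connections unused for at least maxIdleMillis; returns how many were closed. */
    int closeIdle(long maxIdleMillis, long nowMillis) {
        return evict(e -> nowMillis - e.lastUsedMillis >= maxIdleMillis);
    }

    /** Hands out a pooled connection, or creates a fresh one if none is left. */
    Entry lease(long nowMillis, long keepAliveMillis) {
        if (!available.isEmpty()) {
            return available.remove(available.size() - 1);
        }
        return new Entry("new", nowMillis, nowMillis + keepAliveMillis);
    }

    int size() {
        return available.size();
    }

    private int evict(Predicate<Entry> shouldClose) {
        int before = available.size();
        available.removeIf(shouldClose);
        return before - available.size();
    }
}
```

In this model, evicting a connection only frees its slot. The pool never fails a later `lease` call because of it; it simply opens a new connection on demand, which is the behaviour the question asks about.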
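HTTP is based on TCP, which takes care of sending and receiving packets in the correct order and requests retransmission if packets get lost along the way. A TCP connection starts with a TCP handshake consisting of SYN, SYN-ACK and ACK messages, and ends with a FIN, ACK-FIN and ACK series.
[Image from Wikipedia: TCP connection setup and teardown]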
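While HTTP is a request-response protocol, opening and closing connections is costly, which is why HTTP/1.1 allows existing connections to be reused. With the header Connection: keep-alive you tell your client (i.e. your browser) to keep the connection to the server open. A server can have thousands of open connections at the same time. To avoid draining the server's resources, connections are usually time-limited. Through a socket timeout, idle connections, or connections with certain connection issues (dropped internet access and the like), are closed automatically by the server after some predefined time.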
Plenty of HTTP implementations, such as Apache HTTP client 4.4 and later, check the status of a connection only when it is about to be used.
The handling of stale connections was changed in version 4.4. Previously, the code would check every connection by default before re-using it. The code now only checks the connection if the elapsed time since the last use of the connection exceeds the timeout that has been set. The default timeout is set to 2000ms (Source)
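The rule described in that quote can be sketched in plain Java as follows. This is an illustration of the decision logic only, not HttpClient's source code: the class `StaleCheckPolicy` and its methods are hypothetical, and the actual stale check is passed in as a `BooleanSupplier` placeholder. In HttpClient 4.4+ the corresponding threshold is set with `PoolingHttpClientConnectionManager#setValidateAfterInactivity`.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of "validate only after a period of inactivity". */
class StaleCheckPolicy {
    private final long validateAfterInactivityMillis;

    StaleCheckPolicy(long validateAfterInactivityMillis) {
        this.validateAfterInactivityMillis = validateAfterInactivityMillis;
    }

    /** True if the connection has been unused long enough to warrant a stale check. */
    boolean needsCheck(long lastUsedMillis, long nowMillis) {
        return validateAfterInactivityMillis > 0
                && nowMillis - lastUsedMillis >= validateAfterInactivityMillis;
    }

    /**
     * Decides whether a pooled connection may be reused.
     * isStale stands in for the real (comparatively expensive) stale check
     * and is only invoked when the inactivity threshold has been exceeded.
     */
    boolean canReuse(long lastUsedMillis, long nowMillis, BooleanSupplier isStale) {
        if (!needsCheck(lastUsedMillis, nowMillis)) {
            return true; // used recently: trust the connection without checking
        }
        return !isStale.getAsBoolean();
    }
}
```

The trade-off is visible in `canReuse`: a connection that the server closed less than the threshold ago is handed out unchecked, so the first request on it can still fail and may need a retry.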
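If a connection has not been used for some time, the client may not have read the ACK-FIN from the server and therefore still thinks the connection is open when it was in fact closed by the server some time ago. Such a connection is expired and is usually called half-closed. It may therefore be collected by the pool.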
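Applied to the original question, a configuration along the following lines should cover both kinds of eviction. This is a sketch assuming HttpClient 4.5.x on the classpath; the class name `PooledClientFactory` and all numeric values (pool sizes, 30 seconds idle time) are illustrative assumptions and not recommendations.

```java
import java.util.concurrent.TimeUnit;

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PooledClientFactory {

    /** Builds a pooled client; all numbers below are placeholder values. */
    public static CloseableHttpClient create() {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(50);                  // upper bound on connections across all routes
        cm.setDefaultMaxPerRoute(10);        // upper bound per target host
        cm.setValidateAfterInactivity(2000); // stale check only after 2 s without use

        return HttpClients.custom()
                .setConnectionManager(cm)
                // background eviction, so no hand-written monitor thread is needed
                .evictExpiredConnections()
                .evictIdleConnections(30L, TimeUnit.SECONDS)
                .build();
    }
}
```

With `evictExpiredConnections()` and `evictIdleConnections(...)` the builder takes over the role of the dedicated monitor thread described in the tutorial. Evicted connections are not a problem for callers, since the pool opens a new connection the next time one is requested for that route.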
Note that if you send a request containing a Connection: close HTTP header, the connection should be closed right after the client has received the response.
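The state of open connections can be checked via netstat, which most modern operating systems should have. I recently had to check one of our HTTP clients, which was managed through a third-party library that did not propagate the Connection: close header properly and therefore led to plenty of half-closed connections.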