RabbitMQ on high load: socket.error [Errno 104] Connection reset by peer
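I have been using Celery with RabbitMQ as the backend. Whenever I send a high load of tasks (around 600-1000) to RabbitMQ, I get the following error:
socket.error [Errno 104] Connection reset by peer
The sample commands I have been using are: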
for i in {1..500}; do python client.py queue_name time_out bash -c "sleep 20 && touch folder/$i" & done
for i in {1..500}; do python client.py different_queue_name time_out bash -c "sleep 20 && touch folder/$i" & done
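Here client.py sends a task that executes the given bash command on a worker, then polls for the result for time_out seconds.
I also tried spreading the load over a period of time with the following commands. It still gives the same error: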
for i in {1..10}; do for i in {1..50}; do python client.py queue_name time_out bash -c "sleep 60 && touch folder/$i" & done; sleep 10; done
for i in {1..10}; do for i in {1..50}; do python client.py different_queue_name time_out bash -c "sleep 60 && touch folder/$i" & done; sleep 10; done
What causes this behavior, and how can I handle this situation?
=WARNING REPORT== file descriptor limit alarm set.
This warning means that you have reached the file descriptor limit.
You should tune both the OS and RabbitMQ.
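As a quick way to see how close a process is to that limit, here is a minimal Python sketch using the standard `resource` module (Unix only). It reports the limits of the process that runs the script, not those of the RabbitMQ node; for the broker itself, `rabbitmqctl status` reports file descriptor usage. The helper names and the figure of 1000 expected connections are assumptions chosen to match the load described in the question.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(needed, soft_limit):
    """Return True if `needed` descriptors fit under the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # no limit configured
    return needed <= soft_limit


if __name__ == "__main__":
    soft, hard = fd_limits()
    # Assumption: roughly one connection (one socket) per concurrent client.
    expected_connections = 1000
    print("soft limit:", soft, "hard limit:", hard)
    print("enough for expected load:", has_headroom(expected_connections, soft))
```

To be meaningful for the broker, the check has to run as the user that runs RabbitMQ, in the same environment the service starts in.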
Here are the relevant excerpts from the linked documentation that you should follow:
Open File Handles Limit
Operating systems limit maximum number of concurrently open file
handles, which includes network sockets. Make sure that you have
limits set high enough to allow for expected number of concurrent
connections and queues.
Make sure your environment allows for at least 50K open file
descriptors for effective RabbitMQ user, including in development
environments.
As a rule of thumb, multiply the 95th percentile number of concurrent
connections by 2 and add total number of queues to calculate
recommended open file handle limit. Values as high as 500K are not
inadequate and won't consume a lot of hardware resources, and
therefore are recommended for production setups. See Networking guide
for more information.
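The rule of thumb above can be written out as a small calculation. This is only a sketch: the function names are hypothetical, and the 50K floor is taken from the quoted recommendation rather than from any RabbitMQ API.

```python
MIN_RECOMMENDED_FDS = 50_000  # the "at least 50K" floor from the quoted text


def rule_of_thumb_fds(p95_connections, total_queues):
    """95th percentile of concurrent connections times 2, plus total queues."""
    return 2 * p95_connections + total_queues


def recommended_fd_limit(p95_connections, total_queues):
    """Apply the rule of thumb, but never go below the 50K floor."""
    return max(MIN_RECOMMENDED_FDS, rule_of_thumb_fds(p95_connections, total_queues))


if __name__ == "__main__":
    # Load from the question: about 1000 concurrent clients on 2 queues.
    print(rule_of_thumb_fds(1000, 2))
    print(recommended_fd_limit(1000, 2))
```

For the load in the question the raw rule of thumb gives a value far below 50K, so the floor is what determines the recommended limit.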
Erlang VM I/O Thread Pool
Erlang runtime uses a pool of threads for performing I/O operations
asynchronously. The size of the pool is configured via the +A VM
command line flag, e.g. +A 128. We highly recommend overriding the
flag using the RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS environment
variable:
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+A 128"
Default value is 30. Nodes that have 8 or more cores available are
recommended to use values higher than 96, that is, 12 or more I/O
threads for every core available. Note that higher values do not
necessarily mean better throughput or lower CPU burn due to waiting
on I/O.
Tuning for a Large Number of Connections
Some workloads, often referred to as "the Internet of Things", assume
a large number of client connections per node, and a relatively low
volume of traffic from each node. One such workload is sensor
networks: there can be hundreds of thousands or millions of sensors
deployed, each emitting data every several minutes. Optimising for the
maximum number of concurrent clients can be more important than for
total throughput.
Several factors can limit how many concurrent connections a single
node can support:
- Number of open file handles (including sockets)
- Amount of RAM used by each connection
- Amount of CPU resources used by each connection
Hope this helps.