Kafka consumer receiving overhead?
I have a Kafka consumer that polls every 10 seconds. I am using Wireshark to monitor my network activity.
I noticed that even when I am not issuing any fetch requests, there is still traffic between the broker and my consumer. I also noticed that the same packets (nearly identical, with only slight changes in the payload) are sent and received periodically.
Are these some kind of keep-alive packets? How can I reduce them?
Here is a screenshot of these packets:
PS: I am using cppkafka as the library and Kafka broker 0.8.2.2
EDIT: client code
#include <csignal>
#include <iostream>
#include <string>
#include <chrono>
#include <thread>
#include <boost/program_options.hpp>
#include <cppkafka/cppkafka.h>

using namespace std;
using namespace cppkafka;
namespace po = boost::program_options;

bool running = true;
int main(int argc, char* argv[]) {
string brokers;
string topic_name;
string group_id;
po::options_description options("Options");
options.add_options()
("help,h", "produce this help message")
("brokers,b", po::value<string>(&brokers)->required(),
"the kafka broker list")
("topic,t", po::value<string>(&topic_name)->required(),
"the topic to consume from")
("group-id,g", po::value<string>(&group_id)->required(),
"the consumer group id")
;
po::variables_map vm;
try {
po::store(po::command_line_parser(argc, argv).options(options).run(), vm);
po::notify(vm);
}
catch (exception& ex) {
cout << "Error parsing options: " << ex.what() << endl;
cout << endl;
cout << options << endl;
return 1;
}
// Stop processing on SIGINT
signal(SIGINT, [](int) { running = false; });
// Construct the configuration
Configuration config = {
{ "metadata.broker.list", brokers },
{ "api.version.request", false },
{ "broker.version.fallback", "0.8.2.2" },
{ "group.id", group_id },
// Disable auto commit
{ "enable.auto.commit", false }
};
// Create the consumer
Consumer consumer(config);
// Manually assign partition 0 of the topic
TopicPartitionList topicList;
cppkafka::TopicPartition topPar(topic_name,0);
topPar.set_offset(0);
topicList.push_back(topPar);
cout << "Consuming messages from topic " << topic_name << endl;
consumer.assign(topicList);
// Now poll for messages and print them
while (running) {
// Try to consume a message
Message msg = consumer.poll();
if (msg) {
// If we managed to get a message
if (msg.get_error()) {
// Ignore EOF notifications from rdkafka
if (!msg.is_eof()) {
cout << "[+] Received error notification: " << msg.get_error() << endl;
} else {
std::this_thread::sleep_for(std::chrono::milliseconds(10000));
}
} else {
// Print the key (if any)
if (msg.get_key()) {
cout << msg.get_key() << " -> ";
}
// Print the payload
cout << msg.get_payload() << endl;
}
}
}
}
What you are probably seeing are heartbeat messages sent to keep the consumer group alive. You can find more info about them here: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-GroupMembershipAPI
You can adjust the heartbeat interval by modifying heartbeat.interval.ms; see the librdkafka configuration.
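As a sketch, assuming librdkafka's documented property names (heartbeat.interval.ms, session.timeout.ms) and placeholder broker/group values, a longer heartbeat interval could be set like this:

```cpp
#include <cppkafka/cppkafka.h>

using cppkafka::Configuration;

// Sketch: stretch the heartbeat interval to reduce idle traffic.
// session.timeout.ms must remain comfortably larger than
// heartbeat.interval.ms, or the broker may evict the consumer
// from the group.
Configuration config = {
    { "metadata.broker.list", "localhost:9092" },  // placeholder
    { "group.id", "my-group" },                    // placeholder
    { "heartbeat.interval.ms", "10000" },
    { "session.timeout.ms", "30000" }
};
```

The trade-off is that a slower heartbeat means the broker takes longer to notice a dead consumer and trigger a rebalance.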
cppkafka is built on top of librdkafka. librdkafka tries to prefetch messages for all assigned partitions, so that messages are available immediately when you call poll().
By default, librdkafka is quite aggressive (it aims for best performance), so seeing a few FetchRequests per second is expected.
For more details, see librdkafka's FAQ:
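If you are willing to trade some latency for less idle traffic, the fetch loop can be made less chatty. A minimal sketch, assuming librdkafka's documented fetch properties (fetch.wait.max.ms, fetch.min.bytes) and placeholder broker/group values:

```cpp
#include <cppkafka/cppkafka.h>

using cppkafka::Configuration;

// Sketch: reduce the frequency of FetchRequests at the cost of
// higher end-to-end latency. Property names come from librdkafka's
// CONFIGURATION.md.
Configuration config = {
    { "metadata.broker.list", "localhost:9092" },  // placeholder
    { "group.id", "my-group" },                    // placeholder
    // Let the broker hold a FetchRequest open for up to 5s waiting
    // for data, instead of answering (and triggering the next
    // request) almost immediately.
    { "fetch.wait.max.ms", "5000" },
    // Do not answer a fetch until at least this many bytes are
    // available, or fetch.wait.max.ms expires.
    { "fetch.min.bytes", "1024" }
};
```

With settings like these, an idle consumer issues one long-poll FetchRequest every few seconds rather than several per second.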