Kafka consumer receiving overhead?

I have a Kafka consumer that polls every 10 seconds. I'm using Wireshark to monitor my network activity.

I noticed that even when I'm not issuing any fetch requests, there is still traffic between the broker and my consumer. I also noticed that the same packet (almost identical, only the payload changes slightly) is sent and received periodically.

Is this some kind of keep-alive packet? How can I reduce them?

Here is a screenshot of these packets:

PS: I'm using cppkafka as the library, and Kafka broker 0.8.2.2.

Edit: client code

#include <chrono>
#include <csignal>
#include <iostream>
#include <string>
#include <thread>

#include <boost/program_options.hpp>

#include <cppkafka/cppkafka.h>

using namespace std;
using namespace cppkafka;

namespace po = boost::program_options;

bool running = true;

int main(int argc, char* argv[]) {
    string brokers;
    string topic_name;
    string group_id;

    po::options_description options("Options");
    options.add_options()
        ("help,h",     "produce this help message")
        ("brokers,b",  po::value<string>(&brokers)->required(), 
                       "the kafka broker list")
        ("topic,t",    po::value<string>(&topic_name)->required(),
                       "the topic in which to write to")
        ("group-id,g", po::value<string>(&group_id)->required(),
                       "the consumer group id")
        ;

    po::variables_map vm;

    try {
        po::store(po::command_line_parser(argc, argv).options(options).run(), vm);
        po::notify(vm);
    }
    catch (exception& ex) {
        cout << "Error parsing options: " << ex.what() << endl;
        cout << endl;
        cout << options << endl;
        return 1;
    }

    // Stop processing on SIGINT
    signal(SIGINT, [](int) { running = false; });

    // Construct the configuration
    Configuration config = {
        { "metadata.broker.list", brokers },
        { "api.version.request", false },
        { "broker.version.fallback", "0.8.2.2" },   
        { "group.id", group_id },
        // Disable auto commit
        { "enable.auto.commit", false }
    };

    // Create the consumer
    Consumer consumer(config);

    // Subscribe to the topic
    TopicPartitionList topicList;
    cppkafka::TopicPartition topPar(topic_name,0);
    topPar.set_offset(0);
    topicList.push_back(topPar);
    cout << "Consuming messages from topic " << topic_name << endl;

    consumer.assign(topicList);

    // Poll for messages until interrupted
    while (running) {
        // Try to consume a message
        Message msg = consumer.poll();
        if (msg) {
            // If we managed to get a message
            if (msg.get_error()) {
                // Ignore EOF notifications from rdkafka
                if (!msg.is_eof()) {
                    cout << "[+] Received error notification: " << msg.get_error() << endl;
                } else {
                    // Partition EOF: no new messages, wait 10 seconds before polling again
                    std::this_thread::sleep_for(std::chrono::milliseconds(10000));
                }
            } else {
                // Print the key (if any)
                if (msg.get_key()) {
                    cout << msg.get_key() << " -> ";
                }
                // Print the payload
                cout << msg.get_payload() << endl;
            }
        }
    }
}

You are probably seeing heartbeat messages, which keep the consumer group alive; you can find more info about them here: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-GroupMembershipAPI

You can adjust the heartbeat interval by changing heartbeat.interval.ms; see the librdkafka configuration for details.
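
For example, keeping the same Configuration style as in your code, the interval could be raised when building the configuration. This is only a minimal sketch: the numeric values are illustrative, not recommendations, and session.timeout.ms has to stay larger than heartbeat.interval.ms.

Configuration config = {
    { "metadata.broker.list", brokers },
    { "api.version.request", false },
    { "broker.version.fallback", "0.8.2.2" },
    { "group.id", group_id },
    { "enable.auto.commit", false },
    // Send group heartbeats less often (librdkafka's default is 3000 ms)
    { "heartbeat.interval.ms", "10000" },
    // Must stay larger than heartbeat.interval.ms, otherwise the consumer
    // is considered dead and kicked out of the group
    { "session.timeout.ms", "30000" }
};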

cppkafka is built on top of librdkafka. librdkafka prefetches messages for all assigned partitions, so that messages are already available locally when you call poll().

By default librdkafka is quite aggressive (it is tuned for best performance), hence the few FetchRequests per second you are seeing.
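
If it is the fetch traffic itself that you want to reduce, the fetch-related librdkafka properties can be tuned. A rough sketch, again reusing the Configuration block from your code; the values are purely illustrative:

Configuration config = {
    { "metadata.broker.list", brokers },
    { "api.version.request", false },
    { "broker.version.fallback", "0.8.2.2" },
    { "group.id", group_id },
    { "enable.auto.commit", false },
    // Let the broker hold a fetch request up to 1 second before answering...
    { "fetch.wait.max.ms", "1000" },
    // ...or until it has at least this many bytes to return
    { "fetch.min.bytes", "65536" },
    // Wait longer before retrying after a fetch error
    { "fetch.error.backoff.ms", "1000" },
    // Keep a smaller prefetch queue per partition
    { "queued.min.messages", "1000" }
};

The trade-off is latency: with settings like these, messages can arrive up to fetch.wait.max.ms later than with the defaults.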

For more details, see librdkafka's FAQ: