Does Kafka distinguish between consumed offset and committed offset?

As I understand it, a consumer reads messages from a particular topic, and the consumer client periodically commits the offset.

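So if for some reason the consumer fails on a particular message, that offset will not be committed, and you can then go back and reprocess the message.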

Is there anything that tracks the offset you have just consumed and the offset you then commit?

Does Kafka distinguish between consumed offset and committed offset?

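Yes, there is a big difference. The consumed offset is managed by the consumer, in such a way that the consumer will fetch the subsequent messages from a topic partition.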

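The consumer can (but does not have to) commit the offset, either automatically or by calling the commit API. This information is stored in an internal Kafka topic called __consumer_offsets, which holds the committed offset per ConsumerGroup, Topic and Partition. It is used when the client restarts or when a consumer joins or leaves the consumer group.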

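Keep in mind that if your client does not commit offset n but later commits offset n+1, for Kafka this is no different from the case where you commit both offsets. Only the most recent committed offset for a given group and partition matters.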
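To illustrate that last point, here is a minimal in-memory sketch of how committed offsets behave. It is a toy model, not Kafka's storage format or client API: the `OffsetStore` class, its method names, and the group and topic names are made up for illustration. The only things taken from the answer are that the committed offset is kept per consumer group, topic, and partition, and that a later commit replaces an earlier one.

```python
from typing import Dict, Optional, Tuple

# Key used by this toy model: (consumer group, topic, partition).
OffsetKey = Tuple[str, str, int]


class OffsetStore:
    """Hypothetical stand-in for the committed offsets kept in __consumer_offsets."""

    def __init__(self) -> None:
        self._committed: Dict[OffsetKey, int] = {}

    def commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        # Only the latest value per key is kept; earlier commits are overwritten.
        self._committed[(group, topic, partition)] = offset

    def committed(self, group: str, topic: str, partition: int) -> Optional[int]:
        # None means this group has never committed for this partition.
        return self._committed.get((group, topic, partition))


# Client A commits offset n and then n + 1; client B skips the commit for n.
n = 41
store_a = OffsetStore()
store_a.commit("billing", "orders", 0, n)
store_a.commit("billing", "orders", 0, n + 1)

store_b = OffsetStore()
store_b.commit("billing", "orders", 0, n + 1)

# Both stores end up holding the same committed offset.
same_state = (
    store_a.committed("billing", "orders", 0)
    == store_b.committed("billing", "orders", 0)
)
print(same_state)
```

In this model the skipped commit leaves no trace, which mirrors the behaviour described above: after a restart the group would resume from n+1 in both cases.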


Edit: For more details on the consumed and committed offsets, see the section "Offsets and Consumer Position" in the JavaDocs of KafkaConsumer:

Kafka maintains a numerical offset for each record in a partition. This offset acts as a unique identifier of a record within that partition, and also denotes the position of the consumer in the partition. For example, a consumer which is at position 5 has consumed records with offsets 0 through 4 and will next receive the record with offset 5. There are actually two notions of position relevant to the user of the consumer:

The position of the consumer gives the offset of the next record that will be given out. It will be one larger than the highest offset the consumer has seen in that partition. It automatically advances every time the consumer receives messages in a call to poll(Duration).

The committed position is the last offset that has been stored securely. Should the process fail and restart, this is the offset that the consumer will recover to. The consumer can either automatically commit offsets periodically; or it can choose to control this committed position manually by calling one of the commit APIs (e.g. commitSync and commitAsync).

This distinction gives the consumer control over when a record is considered consumed. It is discussed in further detail below.
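The two positions described in the quoted JavaDocs can be mimicked with a small in-memory sketch. This is a toy model, not the Kafka client: `ToyConsumer` and its `poll`, `commit`, and `restart` methods are hypothetical names chosen for illustration, and the partition is just a Python list whose indexes play the role of offsets.

```python
from typing import List


class ToyConsumer:
    """Toy model of one consumer reading one partition (not the real KafkaConsumer)."""

    def __init__(self, log: List[str]) -> None:
        self.log = log        # list index plays the role of the record offset
        self.position = 0     # offset of the next record to hand out
        self.committed = 0    # last position stored durably

    def poll(self, max_records: int = 2) -> List[str]:
        # Handing out records advances the position automatically.
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def commit(self) -> None:
        # Committing stores the current position, i.e. the next offset to read.
        self.committed = self.position

    def restart(self) -> None:
        # After a failure, the consumer resumes from the committed position.
        self.position = self.committed


consumer = ToyConsumer(["r0", "r1", "r2", "r3", "r4", "r5"])

first = consumer.poll()      # hands out r0 and r1, position becomes 2
consumer.commit()            # committed position becomes 2
second = consumer.poll()     # hands out r2 and r3, position becomes 4, no commit

consumer.restart()           # simulated crash before the second commit
replayed = consumer.poll()   # r2 and r3 are handed out again
```

Records that were polled but not committed are delivered again after the restart. That is why holding back the commit until processing has succeeded lets a consumer reprocess a message it failed on, as the question assumes.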