Who keeps track of the last read message offset of the consumer in Apache Kafka?

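In Apache Kafka, who keeps track of the last message a consumer has read? And who keeps track of which consumer group ID is reading from which partition? Is all of this information stored in ZooKeeper?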

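Each consumer group maintains its own offset for every topic partition. Since v0.9, the committed offsets of every consumer group are stored in an internal Kafka topic (prior to v0.9, this information was stored in ZooKeeper). When the offset manager receives an OffsetCommitRequest, it appends the request to a special compacted Kafka topic named __consumer_offsets. Finally, the offset manager sends a successful offset commit response to the consumer only once all replicas of the offsets topic have received the offsets.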
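To make the idea of a compacted offsets topic more concrete, here is a minimal toy model in Python. It is only a sketch and not how Kafka is implemented: the class ToyOffsetsTopic and its methods are hypothetical names, and it assumes each commit is keyed by (group, topic, partition), so that compaction retains just the latest offset per key.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical key for one committed offset: (group id, topic, partition).
OffsetKey = Tuple[str, str, int]


class ToyOffsetsTopic:
    """Toy stand-in for a compacted topic: an append-only log plus a compacted view."""

    def __init__(self) -> None:
        self.log: List[Tuple[OffsetKey, int]] = []

    def commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        # Every commit is appended to the log; nothing is updated in place.
        self.log.append(((group, topic, partition), offset))

    def compacted(self) -> Dict[OffsetKey, int]:
        # Compaction keeps only the most recent value for each key.
        latest: Dict[OffsetKey, int] = {}
        for key, offset in self.log:
            latest[key] = offset
        return latest

    def committed(self, group: str, topic: str, partition: int) -> Optional[int]:
        # Look up the last committed offset, or None if the group never committed.
        return self.compacted().get((group, topic, partition))


offsets = ToyOffsetsTopic()
offsets.commit("billing", "orders", 0, 42)
offsets.commit("billing", "orders", 0, 57)  # supersedes the earlier commit
offsets.commit("audit", "orders", 0, 10)    # a different group has its own offset

print(offsets.committed("billing", "orders", 0))  # 57
print(offsets.committed("audit", "orders", 0))    # 10
```

The point of the model is that two groups reading the same partition never interfere with each other, because the group ID is part of the key.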


Regarding your question about partition assignment, Kafka uses partition.assignment.strategy to determine how partitions are assigned to consumers. This property defaults to RangeAssignor:

The range assignor works on a per-topic basis. For each topic, we lay out the available partitions in numeric order and the consumers in lexicographic order. We then divide the number of partitions by the total number of consumers to determine the number of partitions to assign to each consumer. If it does not evenly divide, then the first few consumers will have one extra partition. For example, suppose there are two consumers C0 and C1, two topics t0 and t1, and each topic has 3 partitions, resulting in partitions t0p0, t0p1, t0p2, t1p0, t1p1, and t1p2. The assignment will be: C0: [t0p0, t0p1, t1p0, t1p1] C1: [t0p2, t1p2]
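The range assignment described above can be sketched in a few lines of Python. This is an illustrative re-implementation of the quoted description, not Kafka's own code: range_assign is a hypothetical helper, and it assumes every consumer is subscribed to every topic.

```python
from typing import Dict, List


def range_assign(consumers: List[str], topics: Dict[str, int]) -> Dict[str, List[str]]:
    """Assign partitions per topic; `topics` maps topic name to partition count."""
    ordered = sorted(consumers)  # consumers in lexicographic order
    assignment: Dict[str, List[str]] = {c: [] for c in ordered}
    for topic in sorted(topics):
        # Each consumer gets `per_consumer` partitions; the first `extra` get one more.
        per_consumer, extra = divmod(topics[topic], len(ordered))
        start = 0
        for i, consumer in enumerate(ordered):
            count = per_consumer + (1 if i < extra else 0)
            assignment[consumer] += [f"{topic}p{p}" for p in range(start, start + count)]
            start += count
    return assignment


result = range_assign(["C0", "C1"], {"t0": 3, "t1": 3})
print(result["C0"])  # ['t0p0', 't0p1', 't1p0', 't1p1']
print(result["C1"])  # ['t0p2', 't1p2']
```

Because the split is done topic by topic, the same consumer receives the extra partition for every topic, which is why C0 ends up with four partitions and C1 with two.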

The other two options are RoundRobinAssignor and StickyAssignor.
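For contrast, here is a rough sketch of round-robin assignment for the same two consumers and two topics. It is not Kafka's implementation: round_robin_assign is a hypothetical helper, and it assumes all consumers subscribe to all topics. StickyAssignor is not sketched here because it also depends on the previous assignment.

```python
from itertools import cycle
from typing import Dict, List


def round_robin_assign(consumers: List[str], topics: Dict[str, int]) -> Dict[str, List[str]]:
    """Deal all partitions of all topics out to consumers one at a time."""
    ordered = sorted(consumers)
    assignment: Dict[str, List[str]] = {c: [] for c in ordered}
    # Lay out every partition of every topic in one ordered list.
    partitions = [f"{t}p{p}" for t in sorted(topics) for p in range(topics[t])]
    # Walk the list, handing each partition to the next consumer in turn.
    for partition, consumer in zip(partitions, cycle(ordered)):
        assignment[consumer].append(partition)
    return assignment


result = round_robin_assign(["C0", "C1"], {"t0": 3, "t1": 3})
print(result["C0"])  # ['t0p0', 't0p2', 't1p1']
print(result["C1"])  # ['t0p1', 't1p0', 't1p2']
```

Here each consumer ends up with three partitions, since the partitions of all topics are distributed together rather than topic by topic.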