Consuming from a single Kafka partition by multiple consumers
I read the following in the Kafka docs:
- The way consumption is implemented in Kafka is by dividing up the partitions in the log over the consumer instances so that each instance is the exclusive consumer of a "fair share" of partitions at any point in time.
- Kafka only provides a total order over records within a partition, not between different partitions in a topic.
- Per-partition ordering combined with the ability to partition data by key is sufficient for most applications.
- However, if you require a total order over records this can be achieved with a topic that has only one partition, though this will mean only one consumer process per consumer group.
I also read the following on this page:
- Consumers read from any single partition, allowing you to scale throughput of message consumption in a similar fashion to message production.
- Consumers can also be organized into consumer groups for a given topic — each consumer within the group reads from a unique partition and the group as a whole consumes all messages from the entire topic.
- If you have more consumers than partitions then some consumers will be idle because they have no partitions to read from.
- If you have more partitions than consumers then consumers will receive messages from multiple partitions.
- If you have equal numbers of consumers and partitions, each consumer reads messages in order from exactly one partition.
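The three cases quoted above can be illustrated with a small, broker-free simulation. This is only a sketch: the `assign` function and its round-robin strategy are hypothetical and are not Kafka's actual assignor. They were chosen only because they respect the rule that each partition has exactly one owner within a group.

```python
def assign(partitions, consumers):
    """Hypothetical round-robin assignment: every partition gets exactly one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment

# More consumers than partitions: the extra consumers get nothing and sit idle.
one_partition = assign([0], ["c1", "c2", "c3"])

# More partitions than consumers: each consumer reads from several partitions.
more_partitions = assign([0, 1, 2, 3], ["c1", "c2"])

# Equal numbers: each consumer reads from exactly one partition.
equal = assign([0, 1], ["c1", "c2"])

print(one_partition)
print(more_partitions)
print(equal)
```

With a single partition, only `c1` receives an assignment in this model, which is the situation the question asks about.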
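Confusion
Does this mean that a single partition cannot be consumed by multiple consumers? Can't we have a single partition and a consumer group with more than one consumer, and have all of them consume from that one partition?
If a single partition can be consumed by only a single consumer, I am wondering why it was designed this way.
What if I need a total order over records and still need to consume them in parallel? Is that not achievable in Kafka? Or does such a scenario not make sense?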
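Within a consumer group, a partition can be consumed by only one consumer at any given time. So no, you cannot have 2 consumers in the same group consuming from the same partition at the same time.
Kafka consumer groups allow multiple consumers to "sort of" behave like a single entity. The group as a whole should consume each message only once. If multiple consumers in a group were to consume the same partition, those records would be processed more than once.
If you need to consume a partition multiple times, make sure those consumers are in different groups.
When processing has to happen in order (sequentially), there is only one task that can be done at any point in time. If you have records 1, 2 and 3 and want to process them in order, you cannot start on record 2 until record 1 has been processed, and the same applies to record 3 after record 2. So what would you want to do in parallel?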
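The difference between sharing a group and using separate groups can be sketched with a toy model. This is not the Kafka client API: `GroupSim`, its `poll` method, and the group names are hypothetical. The sketch assumes only that each group tracks its own read position in the partition, independently of other groups.

```python
class GroupSim:
    """Toy model of a consumer group reading one partition.

    The group as a whole keeps a single read position (offset) for the
    partition, so each record is handed to the group only once.
    """

    def __init__(self, name):
        self.name = name
        self.offset = 0  # next position this group will read from

    def poll(self, log):
        """Return the records this group has not seen yet and advance its offset."""
        records = log[self.offset:]
        self.offset = len(log)
        return records

# A single partition, modeled as an append-only list.
log = ["r1", "r2", "r3"]

billing = GroupSim("billing")  # hypothetical group names
audit = GroupSim("audit")

billing_seen = billing.poll(log)   # the billing group reads every record
audit_seen = audit.poll(log)       # the audit group independently reads every record too
billing_again = billing.poll(log)  # nothing new for billing: no record is delivered twice

print(billing_seen, audit_seen, billing_again)
```

Each group sees the full partition exactly once, which is why two consumers that both need every record belong in different groups.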
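If ordering only matters among records that share a key, the docs quote about per-partition ordering combined with partitioning by key points to a way of getting parallelism. The sketch below is an illustration under assumptions: `partition_for` is a hypothetical helper, and `zlib.crc32` is a stand-in for the hash (Kafka's default partitioner uses a different hash function). The record keys and sequence numbers are made up.

```python
import zlib

def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash (stand-in, not Kafka's own hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# (key, sequence number) pairs in the order they were produced.
records = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)]

num_partitions = 3
partitions = [[] for _ in range(num_partitions)]
for key, seq in records:
    # Records with the same key always land in the same partition, in produce order.
    partitions[partition_for(key, num_partitions)].append((key, seq))

# Read each partition independently (as separate consumers would) and
# collect the sequence numbers observed for each key.
per_key = {}
for partition in partitions:
    for key, seq in partition:
        per_key.setdefault(key, []).append(seq)

print(per_key)
```

Order is preserved per key, but there is no ordering guarantee between keys that end up in different partitions. A true total order across all records still requires a single partition and therefore a single active consumer per group.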