Kafka 中主题的分区和可用分区有什么区别？

Question

我打算为 Kafka Producer 编写自己的 Partitioner，所以我看到了 Kafka 的 DefaultPartitioner.

的实现

我看到它调用Cluster的availablePartitionsForTopic，有时调用partitionsForTopic进行计算分区。

我阅读了文档并查看了源代码，但我看不出两者之间有什么区别。

有人可以为我指出正确的文档或解释其中的区别吗？

Answer 1

如果你为一条记录指定了一个键，Kafka 可能认为你肯定希望将这条记录发送到某个确定的分区，即使当时它不可用。

但是，如果没有指定键，那么 Kafka 可能认为您不关心记录去往的目标分区，因此它会从 "alive" 个分区中随机选择一个。

Answer 2

要回答 difference between partitionsForTopicand availablePartitionsForTopic 的问题（而不是 DefaultPartitioner 如何使用它们来分配分区），代码是唯一的文档

看看org.apache.kafka.common.Cluster、

    this.partitionsByTopic = new HashMap<>(partsForTopic.size());
    this.availablePartitionsByTopic = new HashMap<>(partsForTopic.size());
    for (Map.Entry<String, List<PartitionInfo>> entry : partsForTopic.entrySet()) {
        String topic = entry.getKey();
        List<PartitionInfo> partitionList = entry.getValue();
        this.partitionsByTopic.put(topic, Collections.unmodifiableList(partitionList));
        List<PartitionInfo> availablePartitions = new ArrayList<>();
        for (PartitionInfo part : partitionList) {
            if (part.leader() != null)
                availablePartitions.add(part);
            }
        this.availablePartitionsByTopic.put(topic, Collections.unmodifiableList(availablePartitions));
    }

如您所见，两者之间的区别因素是领导者的可用性

Kafka 中主题的分区和可用分区有什么区别？

What is the difference between partitions & available partitions for topic in Kafka?

apache-kafka

kafka-producer-api