How does KafkaStreams determine whether a GlobalKTable is fully populated while bootstrapping?

The topic I use to create the GlobalKTable is very active. In the documentation for the KStream-GlobalKTable join, I read:

The GlobalKTable is fully bootstrapped upon (re)start of a KafkaStreams instance, which means the table is fully populated with all the data in the underlying topic that is available at the time of the startup. The actual data processing begins only once the bootstrapping has completed.

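How does KafkaStreams decide that it has read all the data? Does it read every message whose timestamp is earlier than the time the KafkaStreams instance started bootstrapping? Or does it use some kind of timeout?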

Either way, I guess we had better get the retention and log compaction of the underlying topic right, or a restart might take a while.

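On startup, Kafka Streams reads the current log end offsets and finishes bootstrapping once all the data up to those offsets has been loaded (see KIP-99). Neither record timestamps nor a timeout are involved.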
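
To illustrate the idea, here is a minimal, self-contained Java sketch. It is a simplified model and not Kafka Streams' actual code: the class `BootstrapSketch`, the `Entry` type, and the in-memory list standing in for a single topic partition are all hypothetical names made up for this example. It only shows that the end offset is captured once at startup and that loading stops when that offset is reached, so records appended afterwards are not part of the bootstrap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BootstrapSketch {

    /** One record in the simulated topic partition (hypothetical type). */
    public static final class Entry {
        final String key;
        final String value; // null models a tombstone (delete)

        public Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Loads the records at offsets [0, endOffsetAtStartup) into a table.
     * Anything at or beyond endOffsetAtStartup is ignored during bootstrap,
     * no matter how many records the log holds by the time we get there.
     */
    public static Map<String, String> bootstrap(List<Entry> log, int endOffsetAtStartup) {
        Map<String, String> table = new HashMap<>();
        int position = 0;
        while (position < endOffsetAtStartup) {
            Entry e = log.get(position);
            if (e.value == null) {
                table.remove(e.key);       // tombstone removes the key
            } else {
                table.put(e.key, e.value); // later values overwrite earlier ones
            }
            position++;
        }
        return table;
    }

    public static void main(String[] args) {
        List<Entry> log = new ArrayList<>();
        log.add(new Entry("a", "1"));
        log.add(new Entry("b", "2"));
        log.add(new Entry("a", "3"));

        // Snapshot the end offset once, at startup.
        int endOffsetAtStartup = log.size();

        // The topic is active: this record arrives after the snapshot.
        log.add(new Entry("c", "4"));

        Map<String, String> table = bootstrap(log, endOffsetAtStartup);
        System.out.println("bootstrapped keys: " + table.keySet());
    }
}
```

In this model the key `c` is not in the table when bootstrapping finishes; in a real application such records would be picked up by the regular global-state updates once processing is running.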

Note that GlobalKTable is designed with static or rarely changing data in mind.

GlobalKTable checkpoints its offsets as of 0.11 (released today), so bootstrapping on restart should be much faster than in 0.10.2.