InvalidStateStoreException: the state store is not open in Kafka Streams

StreamsBuilder builder = new StreamsBuilder();

    Map<String, ?> serdeConfig = Collections.singletonMap(SCHEMA_REGISTRY_URL_CONFIG, schemaRegistryUrl);

    Serde keySerde= getSerde(keyClass);
    keySerde.configure(serdeConfig,true);

    Serde valueSerde = getSerde(valueClass);
    valueSerde.configure(serdeConfig,false);

    StoreBuilder<KeyValueStore<K,V>> store =
        Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("mystore"),
            keySerde,valueSerde).withCachingEnabled();

    builder.addGlobalStore(store,"mytopic", Consumed.with(keySerde,valueSerde),this::processMessage);

    streams=new KafkaStreams(builder.build(),properties);

    registerShutdownHook();

    streams.start();

    readOnlyKeyValueStore = waitUntilStoreIsQueryable("mystore", QueryableStoreTypes.<Object, V>keyValueStore(), streams);


private <T> T waitUntilStoreIsQueryable(final String storeName,
      final QueryableStoreType<T> queryableStoreType,
      final KafkaStreams streams) {

    // 25 seconds
    long timeout=250;

    while (timeout>0) {
      try {
        timeout--;
        return streams.store(storeName, queryableStoreType);
      } catch (InvalidStateStoreException ignored) {
        // store not yet ready for querying
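        try {
          Thread.sleep(100);
        } catch (InterruptedException e) {
          // restore the interrupt flag and stop waiting instead of swallowing the interrupt
          Thread.currentThread().interrupt();
          throw new StreamsException("Interrupted while waiting for store " + storeName, e);
        }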
      }
    }
    throw new StreamsException("ReadOnlyKeyValueStore is not queryable within 25 seconds");
  }

The error is as follows:

19:42:35.049 [my_component.app-91fa5d9f-aba8-4419-a063-93635903ff5d-GlobalStreamThread] ERROR org.apache.kafka.streams.processor.internals.GlobalStreamThread$StateConsumer - global-stream-thread [my_component.app-91fa5d9f-aba8-4419-a063-93635903ff5d-GlobalStreamThread] Updating global state failed. You can restart KafkaStreams to recover from this error.
org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {my_component-0=6}
    at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:990) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:491) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1269) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1200) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1176) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.streams.processor.internals.GlobalStreamThread$StateConsumer.pollAndUpdate(GlobalStreamThread.java:239) [kafka-streams-2.3.0.jar:?]
    at org.apache.kafka.streams.processor.internals.GlobalStreamThread.run(GlobalStreamThread.java:290) [kafka-streams-2.3.0.jar:?]
19:42:35.169 [my_component.app-91fa5d9f-aba8-4419-a063-93635903ff5d-GlobalStreamThread] ERROR org.apache.kafka.streams.KafkaStreams - stream-client [my_component.app-91fa5d9f-aba8-4419-a063-93635903ff5d] Global thread has died. The instance will be in error state and should be closed.
19:42:35.169 [my_component.app-91fa5d9f-aba8-4419-a063-93635903ff5d-GlobalStreamThread] ERROR org.apache.zookeeper.server.NIOServerCnxnFactory - Thread Thread[my_component.app-91fa5d9f-aba8-4419-a063-93635903ff5d-GlobalStreamThread,5,main] died
org.apache.kafka.streams.errors.StreamsException: Updating global state failed. You can restart KafkaStreams to recover from this error.
    at org.apache.kafka.streams.processor.internals.GlobalStreamThread$StateConsumer.pollAndUpdate(GlobalStreamThread.java:250) ~[kafka-streams-2.3.0.jar:?]
    at org.apache.kafka.streams.processor.internals.GlobalStreamThread.run(GlobalStreamThread.java:290) ~[kafka-streams-2.3.0.jar:?]
Caused by: org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {my_component-0=6}
    at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:990) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:491) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1269) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1200) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1176) ~[kafka-clients-2.2.1.jar:?]
    at org.apache.kafka.streams.processor.internals.GlobalStreamThread$StateConsumer.pollAndUpdate(GlobalStreamThread.java:239) ~[kafka-streams-2.3.0.jar:?]
    ... 1 more

org.apache.kafka.streams.errors.InvalidStateStoreException: State store is not available anymore and may have been migrated to another instance; please re-discover its location from the state metadata.

    at org.apache.kafka.streams.state.internals.CompositeReadOnlyKeyValueStore.get(CompositeReadOnlyKeyValueStore.java:60)

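I am seeing two different exceptions.

  1. InvalidStateStoreException - the store is not open

  2. InvalidStateStoreException - the store is not available anymore and may have been migrated to another instance

I have only one instance of the streams application running on Windows with this application ID.

In the code above I wait until the store is queryable, yet I still get "store is not open" and "store may not be available".

What are the possible causes of (and solutions for) these exceptions?

First of all, is the code above written correctly?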

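OffsetOutOfRangeException means that the offset stored in the state's .checkpoint file is out of range for the topic in the Kafka cluster.

This happens when the topic is purged and/or re-created. It may no longer contain as many messages as the offset recorded in the checkpoint.

I found that resetting the .checkpoint file helps. The .checkpoint file looks something like this: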

0
1
my_component 0  6
my_component 1  0

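Here 0 is the partition and 6 is the offset. Likewise, 1 is the partition and 0 is the offset.

The my_component-0=6 in the exception message means that offset 6 of partition 0 of the my_component topic is out of range.

Because the topic was re-created, offset 6 no longer exists. So change 6 to 0.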
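
To make the file layout concrete, here is a minimal sketch that parses checkpoint lines like the ones above and flags an offset that falls outside a topic's current range. It assumes the layout shown (two header lines, then "topic partition offset" entries); the class name CheckpointParser and the log start/end offsets passed to isOutOfRange are hypothetical and would have to come from your own cluster.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: reads the text of a .checkpoint file as laid out above. */
final class CheckpointParser {

    /**
     * Parses checkpoint lines into a map of "topic-partition" -> offset.
     * Assumed layout: the first two lines are headers, every following
     * non-empty line is "topic partition offset" separated by whitespace.
     */
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> offsets = new LinkedHashMap<>();
        for (int i = 2; i < lines.size(); i++) { // skip the two header lines
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Malformed checkpoint line: " + line);
            }
            offsets.put(parts[0] + "-" + parts[1], Long.parseLong(parts[2]));
        }
        return offsets;
    }

    /**
     * Returns true if the checkpointed offset cannot be fetched from a log
     * whose valid range is [logStartOffset, logEndOffset].
     */
    static boolean isOutOfRange(long checkpointedOffset, long logStartOffset, long logEndOffset) {
        return checkpointedOffset < logStartOffset || checkpointedOffset > logEndOffset;
    }
}
```

With the file above, parse yields my_component-0 -> 6 and my_component-1 -> 0. If the re-created topic is empty (start and end offset both 0), the entry for partition 0 is out of range while the entry for partition 1 is not.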


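Note that when unit testing with Kafka, you must clean up the state directory once the test completes. Your embedded Kafka cluster and its topics no longer exist after the test, so it makes no sense to keep their offsets in your state store (they become stale).

So make sure you clean up your state directory (usually /tmp/kafka-streams, or C:\tmp\kafka-streams on Windows) after the test.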
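
One way to do that cleanup from a test teardown is sketched below using only the JDK. The class name StateDirCleaner is hypothetical, and the directory you pass in is whatever your state.dir setting points to. Kafka Streams also offers KafkaStreams#cleanUp(), which wipes the local state for the application ID and may only be called while the instance is not running; the helper here is just a dependency-free alternative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch only: removes a local state directory after a test run. */
final class StateDirCleaner {

    /**
     * Recursively deletes the given directory.
     * Returns false if the directory did not exist, true once it has been removed.
     */
    static boolean deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }
}
```

Call it only after the streams instance has been closed, for example StateDirCleaner.deleteRecursively(Paths.get("/tmp/kafka-streams")) in an @After method; on Windows, files that are still held open cannot be deleted.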

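Also, resetting the checkpoint file is only a workaround, not an ideal solution for production.


In production, if the state store is incompatible with its corresponding topic (that is, the offsets are out of range), it means something has been corrupted; most likely someone deleted and re-created the topic.

In that case, I think cleaning up the state may be the only possible solution, because your state store contains stale information that is no longer valid (as far as the new topic is concerned).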