Kafka with Confluent Kubernetes Helm Charts = Schema Registry WakeupException

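My main question: why is the Schema Registry crashing?

Peripheral question: if I configured one server for each, why are two started for each of zookeeper/kafka/schema-registry? Does everything else look basically correct?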

➜  helm repo update
<snip>

➜  helm install --values values.yaml --name my-confluent-oss confluentinc/cp-helm-charts
<snip>

➜  helm list
NAME                REVISION    UPDATED                     STATUS      CHART                   APP VERSION NAMESPACE
my-confluent-oss    1           Sat Oct 20 19:09:08 2018    DEPLOYED    cp-helm-charts-0.1.0    1.0         default  

➜  kubectl get pods
NAME                                                   READY     STATUS             RESTARTS   AGE
my-confluent-oss-cp-kafka-0                            2/2       Running            0          20m
my-confluent-oss-cp-schema-registry-59d8877584-c2jc7   1/2       CrashLoopBackOff   7          20m
my-confluent-oss-cp-zookeeper-0                        2/2       Running            0          20m

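My values.yaml is below. I have tested it with helm install --debug --dry-run. I am only disabling persistence, setting a single server (this is a development setup running in a VM), and disabling the additional services for now until I get the basics working: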

cp-kafka:
  brokers: 1
  persistence:
    enabled: false

  cp-zookeeper:
    persistence:
      enabled: false
    servers: 1

cp-zookeeper:
  persistence:
    enabled: false
  servers: 1

cp-kafka-connect:
  enabled: false

cp-kafka-rest:
  enabled: false

cp-ksql-server:
  enabled: false

Here are the logs from the failing Schema Registry:

➜  kubectl logs my-confluent-oss-cp-schema-registry-59d8877584-c2jc7 cp-schema-registry-server

<snip>
[2018-10-21 00:28:14,738] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,738] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,751] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:28:14,753] INFO Initialized last consumed offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:28:14,756] INFO [kafka-store-reader-thread-_schemas]: Starting (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:28:14,800] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=my-confluent-oss] Resetting offset for partition _schemas-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2018-10-21 00:28:14,821] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:28:14,857] INFO Wait to catch up until the offset of the last message at 7 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2018-10-21 00:28:14,930] INFO Joining schema registry with Kafka-based coordination (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2018-10-21 00:28:14,939] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,940] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2018-10-21 00:28:14,953] INFO Cluster ID: ofJRwpXzRn-ltDn8b_6h3A (org.apache.kafka.clients.Metadata)
[2018-10-21 00:29:14,945] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:220)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:63)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:41)
    at io.confluent.rest.Application.createServer(Application.java:169)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
    at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector.init(KafkaGroupMasterElector.java:202)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:215)
    ... 4 more
[2018-10-21 00:29:14,948] INFO Shutting down schema registry (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2018-10-21 00:29:14,949] INFO [kafka-store-reader-thread-_schemas]: Shutting down (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,950] INFO [kafka-store-reader-thread-_schemas]: Stopped (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,951] INFO [kafka-store-reader-thread-_schemas]: Shutdown completed (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,953] INFO KafkaStoreReaderThread shutdown complete. (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2018-10-21 00:29:14,953] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2018-10-21 00:29:14,959] ERROR Unexpected exception in schema registry group processing thread (io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector)
org.apache.kafka.common.errors.WakeupException
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:498)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:284)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:243)
    at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.ensureCoordinatorReady(SchemaRegistryCoordinator.java:207)
    at io.confluent.kafka.schemaregistry.masterelector.kafka.SchemaRegistryCoordinator.poll(SchemaRegistryCoordinator.java:97)
    at io.confluent.kafka.schemaregistry.masterelector.kafka.KafkaGroupMasterElector.run(KafkaGroupMasterElector.java:192)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

I am using minikube 0.30.0 with a fresh, clean minikube VM:

➜  kubectl version

Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-22T05:40:33Z", GoVersion:"go1.9.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

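Your Schema Registry cannot join its Kafka group. You need to check the configuration: the Schema Registry has to perform a leader election at startup, and that election can be done through either ZooKeeper or Kafka.

The Helm chart uses Kafka leader election. You can either pass the Kafka broker parameter manually or let the chart pick it up from .Values.kafka.bootstrapServers, but the value of .bootstrapServers appears to be empty when the Schema Registry is installed. You can see the configured values in the deployment by running, for example: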

$ kubectl get deployment my-confluent-oss-cp-schema-registry -o=yaml

You can then change it to point to the internal Kubernetes my-confluent-oss-cp-kafka service endpoint:

$ kubectl edit deployment my-confluent-oss-cp-schema-registry
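
If you would rather keep the fix in values.yaml than edit the live deployment, a minimal sketch might look like the following. It assumes that the cp-schema-registry subchart reads kafka.bootstrapServers (the .Values.kafka.bootstrapServers mentioned above), that the broker exposes a PLAINTEXT listener on port 9092, and that the service is named my-confluent-oss-cp-kafka. Check the service name and port against your release with kubectl get services before relying on it.

```yaml
# Addition to values.yaml (sketch; the listener scheme and port are assumptions)
cp-schema-registry:
  kafka:
    # Internal Kubernetes service endpoint of the Kafka broker
    bootstrapServers: "PLAINTEXT://my-confluent-oss-cp-kafka:9092"
```

The change would then be applied by upgrading the release with the updated file, for example helm upgrade my-confluent-oss confluentinc/cp-helm-charts --values values.yaml.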

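Also, note that at the time of writing cp-helm-charts is in developer preview, so use it at your own risk.

Another parameter you can configure is SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_CONFIG (the kafkastore.init.timeout.ms setting), since this is where you are seeing the error. The Schema Registry may be timing out while trying to connect to the Kafka store (possibly related to minikube). The odd thing is that it should retry.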
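
As a rough sketch of raising that timeout, the fragment below adds an environment variable to the cp-schema-registry-server container of the deployment. The variable name SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_MS is an assumption, based on the Confluent Docker image convention of turning a property such as kafkastore.init.timeout.ms into an upper-case, underscore-separated name with a SCHEMA_REGISTRY_ prefix. The value 120000 is only illustrative; in the log above the failure is reported 60 seconds after the join begins.

```yaml
# Fragment of the Deployment spec (under spec.template.spec), not a complete manifest
containers:
  - name: cp-schema-registry-server
    env:
      # Assumed variable name for kafkastore.init.timeout.ms; value is in milliseconds
      - name: SCHEMA_REGISTRY_KAFKASTORE_INIT_TIMEOUT_MS
        value: "120000"
```

A longer timeout is only likely to help if the broker is reachable but slow to respond. If the bootstrap servers value is wrong, the join will still time out.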