Kafka with Spark throws "Could not initialize class kafka.utils.Log4jController" error

I am trying to write a Kafka consumer in Java using Apache Spark. The code does not execute because of some Log4jController error. Not sure what I am missing.

The pom.xml file is as follows:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.7.25</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.3.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
    <version>2.3.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.11</artifactId>
    <version>1.0.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.zookeeper</groupId>
            <artifactId>zookeeper</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>

The following error occurs:

5645 [dag-scheduler-event-loop] INFO  org.apache.spark.scheduler.DAGScheduler  - ResultStage 11 (start at RuleEngine.java:431) failed in 0.094 s due to Job aborted due to stage failure: Task 0 in stage 11.0 failed 1 times, most recent failure: Lost task 0.0 in stage 11.0 (TID 8, localhost, executor driver): java.lang.NoClassDefFoundError: Could not initialize class kafka.utils.Log4jController$

Edit:

I was able to resolve the problem by changing the kafka client version in pom.xml:
<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.0</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.3.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.11</artifactId>
    <version>0.8.2.2</version>
</dependency>

Your main problem is that a NoClassDefFoundError is being thrown:

How to resolve java.lang.NoClassDefFoundError:

  1. The class is not available on the Java classpath.
  2. You might be running your program with the jar command and the class was not defined in the manifest file's Class-Path attribute.
  3. A start-up script is overriding the Classpath environment variable.
  4. Because NoClassDefFoundError is a subclass of java.lang.LinkageError, it can also occur if one of its dependencies, such as a native library, is not available.
  5. Check for java.lang.ExceptionInInitializerError in your log file; NoClassDefFoundError caused by a failed static initialization is quite common.
  6. If you are working in a J2EE environment, then visibility of a class among multiple classloaders can also cause java.lang.NoClassDefFoundError; see the examples and scenarios section for a detailed discussion.
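
Point 5 in the list above matches the log in this question exactly. A minimal, self-contained sketch (class names are made up for illustration) of how a failed static initializer turns into "Could not initialize class":

```java
// Illustrates point 5 with hypothetical class names: once a static
// initializer fails, the JVM marks the class as erroneous, and every later
// reference produces "NoClassDefFoundError: Could not initialize class ..."
// -- the same message pattern seen for kafka.utils.Log4jController$.
public class InitFailureDemo {

    static class Broken {
        // Static initialization runs on first use of the class and fails here.
        static final long STAMP = init();

        static long init() {
            throw new RuntimeException("static init failed");
        }
    }

    // First access: the static initializer itself blows up.
    static String firstAccessError() {
        try {
            long ignored = Broken.STAMP;
            return "no error";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    // Any later access: the class is already in the erroneous state.
    static String secondAccessError() {
        try {
            long ignored = Broken.STAMP;
            return "no error";
        } catch (Throwable t) {
            return t.getClass().getSimpleName() + ": " + t.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("first  access -> " + firstAccessError());
        System.out.println("second access -> " + secondAccessError());
    }
}
```

Only the first use fails with ExceptionInInitializerError carrying the real cause; every later use reports just "Could not initialize class", which is why the root cause of the Log4jController failure is usually found earlier in the log.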

You can follow this link for more details: noClassDefFoundError

Checking your pom, the problem seems to be that you are using kafka 1.0.0 together with spark-streaming-kafka-0-8, which expects kafka 0.8. Indeed, searching for kafka.utils.Log4jController shows that it is part of the kafka-clients library in versions 0.8.1 and 0.8.2, but not in later versions. I am not an expert on Spark, but I think you just need to find the spark-streaming-kafka library version that matches your kafka version. Hope that helps.
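
As an alternative to downgrading kafka to 0.8.2.2, you could keep kafka 1.0.0 and switch to the integration artifact built for newer clients, spark-streaming-kafka-0-10. A sketch of the dependency only; note that this artifact has a different API than the 0-8 one (direct stream with ConsumerStrategies), so the consumer code would need adjusting as well:

```xml
<!-- Integration module for Kafka 0.10+ brokers/clients;
     replaces spark-streaming-kafka-0-8_2.11 -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
    <version>2.3.0</version>
</dependency>
```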