Zeppelin + Spark: Reading Parquet from S3 throws NoSuchMethodError: com.fasterxml.jackson
Using the Zeppelin 0.7.2 binaries from the main download, and Spark 2.1.0 w/ Hadoop 2.6, the following paragraph:
val df = spark.read.parquet(DATA_URL).filter(FILTER_STRING).na.fill("")
throws the following:
java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:49)
at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
at com.fasterxml.jackson.module.scala.deser.ScalaNumberDeserializersModule$class.$init$(ScalaNumberDeserializersModule.scala:61)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.<init>(DefaultScalaModule.scala:20)
at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<init>(DefaultScalaModule.scala:37)
at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<clinit>(DefaultScalaModule.scala)
at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:701)
at org.apache.spark.SparkContext.parallelize(SparkContext.scala:715)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.mergeSchemasInParallel(ParquetFileFormat.scala:594)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.inferSchema(ParquetFileFormat.scala:235)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun.apply(DataSource.scala:184)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun.apply(DataSource.scala:184)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$getOrInferFileFormatSchema(DataSource.scala:183)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:387)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:425)
... 47 elided
This error does not occur in a regular spark-shell, only in Zeppelin. I have attempted the following fixes, none of which had any effect:
- Downloading the jackson 2.6.2 jars into the zeppelin lib folder and restarting
- Adding the jackson 2.9 dependencies from the maven repository to the interpreter settings
- Deleting the jackson jars from the zeppelin lib folder
Googling has not turned up any similar situations. Please don't hesitate to ask for more information, or to make suggestions. Thanks!
Alternatively, you can include it directly in a notebook cell:
%dep
z.load("com.fasterxml.jackson.core:jackson-core:2.6.2")
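Note that %dep only takes effect if it runs before the Spark interpreter has started; if a SparkContext is already up, restart the interpreter first and make the %dep paragraph the first one you execute. A slightly fuller sketch (the 2.6.2 version follows the question above; ideally pin whichever Jackson version your Spark build actually bundles):
%dep
// must run before the Spark interpreter starts
z.reset() // clears any previously loaded artifacts
z.load("com.fasterxml.jackson.core:jackson-core:2.6.2")
z.load("com.fasterxml.jackson.core:jackson-databind:2.6.2")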
I ran into the same problem. I had added com.amazonaws:aws-java-sdk and org.apache.hadoop:hadoop-aws as dependencies for the Spark interpreter. These dependencies bring in their own versions of com.fasterxml.jackson.core:* that conflict with the one Spark ships.
You also have to exclude com.fasterxml.jackson.core:* from the other dependencies. Here is an example dependencies section for the Spark interpreter in ${ZEPPELIN_HOME}/conf/interpreter.json:
"dependencies": [
{
"groupArtifactVersion": "com.amazonaws:aws-java-sdk:1.7.4",
"local": false,
"exclusions": ["com.fasterxml.jackson.core:*"]
},
{
"groupArtifactVersion": "org.apache.hadoop:hadoop-aws:2.7.1",
"local": false,
"exclusions": ["com.fasterxml.jackson.core:*"]
}
]
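After restarting the interpreter, you can check which Jackson build actually won on the driver classpath from a notebook paragraph. A minimal diagnostic sketch, assuming a Scala Spark interpreter (Spark 2.1.0 bundles Jackson 2.6.x, so that is roughly what you would expect to see):
// print the jar ObjectMapper was loaded from, and the Jackson runtime version
import com.fasterxml.jackson.databind.ObjectMapper
println(classOf[ObjectMapper].getProtectionDomain.getCodeSource.getLocation)
println(com.fasterxml.jackson.core.json.PackageVersion.VERSION)
If the printed location points at a jar pulled in by aws-java-sdk or hadoop-aws rather than Spark's own jars directory, the exclusions above have not taken effect.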