Why does mapping over a DataFrame using a case class fail with "Unable to find encoder for type stored in a Dataset"?

I have already imported spark.implicits._, but I still get the following error:

Error:(27, 33) Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.

I have a case class such as:

case class User(name: String, dept: String)

and I am converting a DataFrame to a Dataset in the following ways:

val ds = df.map { row => User(row.getString(0), row.getString(1)) }

val ds = df.as[User]
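
For context, here is a minimal self-contained version of the structure I am compiling (the SparkSession setup and the sample data are illustrative, not my real job):

import org.apache.spark.sql.SparkSession

object UserAnalytics extends App {
    val spark = SparkSession.builder()
        .appName("UserAnalytics")
        .master("local[*]")
        .getOrCreate()
    import spark.implicits._

    val df = Seq(("Alice", "Sales"), ("Bob", "IT")).toDF("name", "dept")

    // case class declared *inside* the object
    case class User(name: String, dept: String)

    // both conversions fail to compile with the encoder error above
    val ds = df.map { row => User(row.getString(0), row.getString(1)) }
    val ds2 = df.as[User]
}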

Also, when I try the same code in the spark-shell I don't get any error; it is only when I run it from IntelliJ or submit the job that I see this error.

Is there a reason for this?

Moving the declaration of the case class out of the enclosing scope did the trick! Spark has to derive an Encoder[User] for the map, and that derivation only works reliably when the case class is declared at the top level; the spark-shell treats classes defined in the REPL specially, which is why the same code ran there.

The code structure then looks like this:

package main.scala.UserAnalytics

// case class *outside* the main object
case class User(name: String, dept: String)

object UserAnalytics extends App {
    ...
    val ds = df.map { row => User(row.getString(0), row.getString(1)) }
}
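
For completeness, here is a self-contained sketch of the fixed layout (the SparkSession setup and sample data are again illustrative); with User declared at the top level, both conversions from the question compile:

import org.apache.spark.sql.SparkSession

// declared at the top level, so spark.implicits._ can derive an Encoder[User]
case class User(name: String, dept: String)

object UserAnalytics extends App {
    val spark = SparkSession.builder()
        .appName("UserAnalytics")
        .master("local[*]")
        .getOrCreate()
    import spark.implicits._

    val df = Seq(("Alice", "Sales"), ("Bob", "IT")).toDF("name", "dept")

    val ds = df.map { row => User(row.getString(0), row.getString(1)) }  // Dataset[User]
    val ds2 = df.as[User]                                                // Dataset[User]

    ds2.show()
    spark.stop()
}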