Why does mapping over a DataFrame using a case class fail with "Unable to find encoder for type stored in a Dataset"?
I have imported spark.implicits._, but I still get the error:

Error:(27, 33) Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
I have a case class such as:
case class User(name: String, dept: String)
and I am converting the DataFrame to a Dataset with:

val ds = df.map { row => User(row.getString(0), row.getString(1)) }
or
val ds = df.as[User]
Also, when I try the same code in spark-shell, I get no error; I only see this error when I run it through IntelliJ or submit the job. Why is that?
Moving the declaration of the case class out of the enclosing scope did it! Spark can only derive an implicit Encoder[User] when User is defined at the top level of a file, not inside another object, class, or method; in spark-shell every definition you type is effectively top-level, which is why the error never appears there. The code is then structured as follows:
package main.scala.UserAnalytics
// case class *outside* the main object
case class User(name: String, dept: String)
object UserAnalytics extends App {
...
val ds = df.map { row => User(row.getString(0), row.getString(1)) }
}
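
For completeness, here is a minimal self-contained sketch of the fixed program. Only the file layout (the case class at the top level, outside the object) comes from the answer above; the SparkSession setup, the local master, and the sample rows are assumptions added so the sketch runs on its own:

package main.scala.UserAnalytics

import org.apache.spark.sql.SparkSession

// Top-level case class: Spark can derive an Encoder[User] for it.
case class User(name: String, dept: String)

object UserAnalytics extends App {
  // Assumption: a local SparkSession just for testing the conversion.
  val spark = SparkSession.builder()
    .appName("UserAnalytics")
    .master("local[*]")
    .getOrCreate()

  import spark.implicits._ // brings the implicit Encoders into scope

  // Hypothetical sample data standing in for the real DataFrame.
  val df = Seq(("alice", "eng"), ("bob", "sales")).toDF("name", "dept")

  // Both conversions now compile, because Encoder[User] is derivable.
  val ds1 = df.map { row => User(row.getString(0), row.getString(1)) }
  val ds2 = df.as[User]

  ds2.show()
  spark.stop()
}

Note that df.as[User] matches columns by name, so it is the safer choice when the DataFrame's column order might change; the row.getString(0) version depends on positional indices.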