Why does a case class return a DataFrame in Spark SQL?
I'll use Union as the only example here.
I am reading the Spark SQL source code and got stuck on this piece of code, which is located in DataFrame.scala:
def unionAll(other: DataFrame): DataFrame = Union(logicalPlan, other.logicalPlan)
Union, however, is defined as a case class like this:
case class Union(left: LogicalPlan, right: LogicalPlan) extends BinaryNode {...}
I'm confused: how can the result be treated as an instance of type DataFrame?
Well, if something in Scala is unclear, it has to be an implicit. First, let's take a look at the BinaryNode definition:
abstract class BinaryNode extends LogicalPlan
Since a LogicalPlan combined with a SQLContext is the only thing required to create a DataFrame, it looks like a good place for a conversion. And here it is:
@inline private implicit def logicalPlanToDataFrame(logicalPlan: LogicalPlan): DataFrame = {
  new DataFrame(sqlContext, logicalPlan)
}
In fact, this conversion was removed in 1.6.0 by SPARK-11513, with the following description:
DataFrame has an internal implicit conversion that turns a LogicalPlan into a DataFrame. This has been fairly confusing to a few new contributors. Since it doesn't buy us much, we should just remove that implicit conversion.
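To see the mechanism in isolation, here is a minimal, self-contained sketch. The classes below are simplified, hypothetical stand-ins that only borrow the Spark names (Relation and unionAllExplicit are invented for illustration); this is not the real Spark code. It shows how a private implicit def inside DataFrame lets a method body of type Union satisfy a declared DataFrame result type, and what the equivalent explicit wrapping looks like once the implicit is gone.

```scala
import scala.language.implicitConversions

// Simplified stand-ins for the Spark classes (assumptions, not the real API).
object ImplicitSketch {
  final class SQLContext

  abstract class LogicalPlan
  abstract class BinaryNode extends LogicalPlan
  // Hypothetical leaf node, used only to have something to union.
  case class Relation(name: String) extends LogicalPlan
  case class Union(left: LogicalPlan, right: LogicalPlan) extends BinaryNode

  class DataFrame(val sqlContext: SQLContext, val logicalPlan: LogicalPlan) {
    // Visible only inside DataFrame, mirroring the private implicit quoted above.
    private implicit def logicalPlanToDataFrame(plan: LogicalPlan): DataFrame =
      new DataFrame(sqlContext, plan)

    // The body has type Union but the declared result type is DataFrame,
    // so the compiler wraps it as logicalPlanToDataFrame(Union(...)).
    def unionAll(other: DataFrame): DataFrame =
      Union(logicalPlan, other.logicalPlan)

    // The same method written without relying on the implicit conversion.
    def unionAllExplicit(other: DataFrame): DataFrame =
      new DataFrame(sqlContext, Union(logicalPlan, other.logicalPlan))
  }

  def main(args: Array[String]): Unit = {
    val ctx = new SQLContext
    val a = new DataFrame(ctx, Relation("a"))
    val b = new DataFrame(ctx, Relation("b"))
    println(a.unionAll(b).logicalPlan)         // Union(Relation(a),Relation(b))
    println(a.unionAllExplicit(b).logicalPlan) // Union(Relation(a),Relation(b))
  }
}
```

Both methods build the same plan; the only difference is whether the compiler or the author writes the call that wraps the LogicalPlan in a DataFrame.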