found: org.apache.spark.sql.Dataset[(Double, Double)] required: org.apache.spark.rdd.RDD[(Double, Double)]
I get the following error
found : org.apache.spark.sql.Dataset[(Double, Double)]
required: org.apache.spark.rdd.RDD[(Double, Double)]
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)
for the following code:
val testScoreAndLabel = testResults.
select("Label","ModelProbability").
map{ case Row(l:Double,p:Vector) => (p(1),l) }
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)
Judging from the error, testScoreAndLabel is of type sql.Dataset, but BinaryClassificationMetrics requires an RDD. How can I convert a sql.Dataset to an RDD?
I would do something like this:
val testScoreAndLabel = testResults.
select("Label","ModelProbability").
map{ case Row(l:Double,p:Vector) => (p(1),l) }
Now just call testScoreAndLabel.rdd to convert testScoreAndLabel into an RDD:
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel.rdd)
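Putting it together, here is a sketch of the whole pipeline. It assumes Spark 2.x, a SparkSession named `spark`, and a DataFrame `testResults` with the column names from the question ("Label" as Double, "ModelProbability" as an ml Vector); calling `.rdd` before the `map` also sidesteps the need for a tuple Encoder:

```scala
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.Row

// assumed to exist: a SparkSession `spark` and a DataFrame `testResults`
// with columns "Label" (Double) and "ModelProbability" (Vector)
val testScoreAndLabel = testResults
  .select("Label", "ModelProbability")
  .rdd                                             // DataFrame -> RDD[Row]
  .map { case Row(l: Double, p: Vector) => (p(1), l) } // (score, label) pairs

// BinaryClassificationMetrics takes an RDD[(Double, Double)]
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)
println(s"Area under ROC = ${testMetrics.areaUnderROC()}")
```

Either ordering works: converting with `.rdd` before the `map` (as above) or after it (as in the answer); both end with an RDD[(Double, Double)] that BinaryClassificationMetrics accepts.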