RandomForestClassifier has no attribute transform, so how do I get predictions?
How do I get predictions from a RandomForestClassifier? Loosely following the latest docs here, my code looks like...
# Split the data into training and test sets (25% held out for testing)
SPLIT_SEED = 64 # some const seed just for reproducibility
TRAIN_RATIO = 0.75
(trainingData, testData) = df.randomSplit([TRAIN_RATIO, 1-TRAIN_RATIO], seed=SPLIT_SEED)
print(f"Training set ({trainingData.count()}):")
trainingData.show(n=3)
print(f"Test set ({testData.count()}):")
testData.show(n=3)
# Train a RandomForest model.
rf = RandomForestClassifier(labelCol="labels", featuresCol="features", numTrees=36)
rf.fit(trainingData)
#print(rf.featureImportances)
preds = rf.transform(testData)
When I run this, I get the error
AttributeError: 'RandomForestClassifier' object has no attribute 'transform'
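Checking the Python API docs, I don't see anything that looks related to generating predictions from a trained model (or to feature importances, for that matter). I don't have much experience with mllib, so I'm not sure what to do. Does anyone with more experience know what to do here?
Looking more closely at the example in the documentation: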
>>> model = rf.fit(td)
>>> model.featureImportances
SparseVector(1, {0: 1.0})
>>> allclose(model.treeWeights, [1.0, 1.0, 1.0])
True
>>> test0 = spark.createDataFrame([(Vectors.dense(-1.0),)], ["features"])
>>> result = model.transform(test0).head()
>>> result.prediction
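you'll notice that rf.fit returns a fitted model (a RandomForestClassificationModel), which is a different object from the original RandomForestClassifier instance.
That fitted model is the object that has the transform method and the featureImportances attribute.
So in your code, keep the return value of fit and call transform on it: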
# Train a RandomForest model.
rf = RandomForestClassifier(labelCol="labels", featuresCol="features", numTrees=36)
model = rf.fit(trainingData)
#print(model.featureImportances)
preds = model.transform(testData)
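If it helps to see why the split exists, here is a minimal pure-Python sketch of the same estimator/model pattern. It is not Spark's implementation: the classes MajorityClassEstimator and MajorityClassModel and the toy rows are hypothetical, made up only to show that fit returns a separate fitted object and that this object is the one that can transform.

```python
from collections import Counter
from dataclasses import dataclass


class MajorityClassEstimator:
    """Hypothetical toy estimator: holds configuration only, so it has fit() but no transform()."""

    def __init__(self, label_col="labels"):
        self.label_col = label_col

    def fit(self, rows):
        # Learn the most frequent label and return a *new* fitted object;
        # the estimator itself is left unchanged.
        counts = Counter(row[self.label_col] for row in rows)
        majority_label = counts.most_common(1)[0][0]
        return MajorityClassModel(majority_label)


@dataclass
class MajorityClassModel:
    """Hypothetical toy fitted model: holds learned state, so it is the object that predicts."""

    majority_label: int

    def transform(self, rows):
        # Return copies of the rows with a "prediction" field added.
        return [{**row, "prediction": self.majority_label} for row in rows]


# Toy data standing in for trainingData / testData (assumption, not from the question).
training_rows = [{"labels": 1}, {"labels": 1}, {"labels": 0}]
test_rows = [{"labels": 0}, {"labels": 1}]

estimator = MajorityClassEstimator(label_col="labels")
model = estimator.fit(training_rows)  # keep the return value, as with rf.fit
preds = model.transform(test_rows)    # transform lives on the fitted model

print(hasattr(estimator, "transform"))
print([row["prediction"] for row in preds])
```

Calling estimator.transform(test_rows) here would raise an AttributeError of the same kind as the one in the question, because only the object returned by fit carries the learned state.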