DecisionTree model has zero accuracy
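I trained a model on a basic dataset (a 2D array of Hours_Studied and Test_Grade) and got some predictions, but whenever I try to compute accuracy_score, it is always 0.0.
My guess is that the problem lies in the array shapes after the split.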
import pandas as pd
import numpy as np
df = pd.read_csv('c:/Rawdata/grade2.csv', header=0)
print('Raw Dataset Length:', len(df))
print('Raw Dataset Shape:', df.shape)
# raw dataset info output is "Raw Dataset Length: 9" and "Raw Dataset Shape: (9, 2)"
from sklearn.model_selection import train_test_split
X = np.array(df['Hours_Studied']).reshape(-1, 1)
y = df['Test_Grade']
print ('Processed Dataset shape', X.shape, y.shape)
# Processed dataset output is "(9, 1) (9,)"
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=100)
Instead of this:
from sklearn.tree import DecisionTreeClassifier
tree = DecisionTreeClassifier(criterion = 'entropy', random_state=100)
the new code is:
from sklearn.tree import DecisionTreeRegressor
tree = DecisionTreeRegressor(random_state=100)
No changes here:
tree.fit(X_train, y_train)
tree_pred = tree.predict(X_test)
print ('tree predicted array is', tree_pred)
# output is "[57 96 79]"
Instead of accuracy_score:
from sklearn.metrics import accuracy_score
use this:
from sklearn.metrics import r2_score
print('current y_test is ', '\n', y_test)
# output is
# 1    66
# 6    91
# 5    81
# Name: Test_Grade, dtype: int64
Instead of this:
print('Accuracy tree is', accuracy_score(y_test, tree_pred))
# output is "Accuracy tree is 0.0"
we now have:
print('R^2 score (x100) of tree is', r2_score(y_test, tree_pred)*100)
# output is "R^2 score (x100) of tree is 65.26315789473685"
The zero-accuracy problem is solved, thanks!
Use a classification tree when your targets are discrete labels, and a regression tree when they are continuous values, as in this case.
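To see why the score was 0.0, it may help to compute the metrics by hand on the three test values from the post. The sketch below is plain Python and only approximates what the scikit-learn metrics do for this simple case: the helper names (exact_match_accuracy, r_squared, mean_abs_error) are made up for illustration and are not library functions. Accuracy counts only predictions that equal the true value exactly, so predicting 57 for a grade of 66 is scored as a miss, the same as any other wrong answer. R^2 and the mean absolute error measure how far off the predictions are instead.

```python
# Values copied from the post: true test grades and the tree's predictions.
y_true = [66, 91, 81]
y_pred = [57, 96, 79]


def exact_match_accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true value exactly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mean_abs_error(y_true, y_pred):
    """Average absolute difference, in the same units as the grades."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


accuracy = exact_match_accuracy(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
mae = mean_abs_error(y_true, y_pred)

print('exact-match accuracy:', accuracy)        # 0.0, no prediction matches exactly
print('R^2:', round(r2, 4))                     # 0.6526
print('mean absolute error:', round(mae, 2))    # 5.33 grade points
```

Note that R^2 is not a percentage of correct answers, so multiplying it by 100 does not turn it into an accuracy. With only three test rows, any of these numbers is a rough indication at best.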