完美的精度、召回率和 f1 分数,但预测不佳
Perfect precision, recall and f1-score, yet bad prediction
使用 scikit-learn 对二元问题进行分类。获得完美 classification_report
(全 1)。然而预测给出 0.36
。怎么可能?
我熟悉不平衡的标签。但我不认为这里是这种情况,因为 f1
和其他分数列以及混淆矩阵表示满分。
# Set aside the last 19 rows for prediction.
X1, X_Pred, y1, y_Pred = train_test_split(X, y, test_size= 19,
shuffle = False, random_state=None)
X_train, X_test, y_train, y_test = train_test_split(X1, y1,
test_size= 0.4, stratify = y1, random_state=11)
clcv = DecisionTreeClassifier()
scorecv = cross_val_score(clcv, X1, y1, cv=StratifiedKFold(n_splits=4),
scoring= 'f1') # to balance precision/recall
clcv.fit(X1, y1)
y_predict = clcv.predict(X1)
cm = confusion_matrix(y1, y_predict)
cm_df = pd.DataFrame(cm, index = ['0','1'], columns = ['0','1'] )
print(cm_df)
print(classification_report( y1, y_predict ))
print('Prediction score:', clcv.score(X_Pred, y_Pred)) # unseen data
输出:
confusion:
0 1
0 3011 0
1 0 44
precision recall f1-score support
False 1.00 1.00 1.00 3011
True 1.00 1.00 1.00 44
micro avg 1.00 1.00 1.00 3055
macro avg 1.00 1.00 1.00 3055
weighted avg 1.00 1.00 1.00 3055
Prediction score: 0.36
问题是你过拟合了。
有很多代码没有用到,我们来删减一下:
# Set aside the last 19 rows for prediction.
X1, X_Pred, y1, y_Pred = train_test_split(X, y, test_size= 19,
shuffle = False, random_state=None)
clcv = DecisionTreeClassifier()
clcv.fit(X1, y1)
y_predict = clcv.predict(X1)
cm = confusion_matrix(y1, y_Pred)
cm_df = pd.DataFrame(cm, index = ['0','1'], columns = ['0','1'] )
print(cm_df)
print(classification_report( y1, y_Pred ))
print('Prediction score:', clcv.score(X_Pred, y_Pred)) # unseen data
很明显,这里没有交叉验证,预测分数低的明显原因是决策树分类器的过度拟合。
使用交叉验证的分数,您应该可以直接看到问题。
使用 scikit-learn 对二元问题进行分类。获得完美 classification_report
(全 1)。然而预测给出 0.36
。怎么可能?
我熟悉不平衡的标签。但我不认为这里是这种情况,因为 f1
和其他分数列以及混淆矩阵表示满分。
# Set aside the last 19 rows for prediction.
X1, X_Pred, y1, y_Pred = train_test_split(X, y, test_size= 19,
shuffle = False, random_state=None)
X_train, X_test, y_train, y_test = train_test_split(X1, y1,
test_size= 0.4, stratify = y1, random_state=11)
clcv = DecisionTreeClassifier()
scorecv = cross_val_score(clcv, X1, y1, cv=StratifiedKFold(n_splits=4),
scoring= 'f1') # to balance precision/recall
clcv.fit(X1, y1)
y_predict = clcv.predict(X1)
cm = confusion_matrix(y1, y_predict)
cm_df = pd.DataFrame(cm, index = ['0','1'], columns = ['0','1'] )
print(cm_df)
print(classification_report( y1, y_predict ))
print('Prediction score:', clcv.score(X_Pred, y_Pred)) # unseen data
输出:
confusion:
0 1
0 3011 0
1 0 44
precision recall f1-score support
False 1.00 1.00 1.00 3011
True 1.00 1.00 1.00 44
micro avg 1.00 1.00 1.00 3055
macro avg 1.00 1.00 1.00 3055
weighted avg 1.00 1.00 1.00 3055
Prediction score: 0.36
问题是你过拟合了。
有很多代码没有用到,我们来删减一下:
# Set aside the last 19 rows for prediction.
X1, X_Pred, y1, y_Pred = train_test_split(X, y, test_size= 19,
shuffle = False, random_state=None)
clcv = DecisionTreeClassifier()
clcv.fit(X1, y1)
y_predict = clcv.predict(X1)
cm = confusion_matrix(y1, y_Pred)
cm_df = pd.DataFrame(cm, index = ['0','1'], columns = ['0','1'] )
print(cm_df)
print(classification_report( y1, y_Pred ))
print('Prediction score:', clcv.score(X_Pred, y_Pred)) # unseen data
很明显,这里没有交叉验证,预测分数低的明显原因是决策树分类器的过度拟合。
使用交叉验证的分数,您应该可以直接看到问题。