完美的精度、召回率和 f1 分数,但预测不佳

Perfect precision, recall and f1-score, yet bad prediction

使用 scikit-learn 对二元问题进行分类。获得完美 classification_report(全 1)。然而预测给出 0.36。怎么可能?

我熟悉不平衡的标签。但我不认为这里是这种情况,因为 f1 和其他分数列以及混淆矩阵表示满分。

# Set aside the last 19 rows for prediction.
X1, X_Pred, y1, y_Pred = train_test_split(X, y, test_size= 19, 
                shuffle = False, random_state=None)

X_train, X_test, y_train, y_test = train_test_split(X1, y1, 
         test_size= 0.4, stratify = y1, random_state=11)

clcv = DecisionTreeClassifier()
scorecv = cross_val_score(clcv, X1, y1, cv=StratifiedKFold(n_splits=4), 
                         scoring= 'f1') # to balance precision/recall
clcv.fit(X1, y1)
y_predict = clcv.predict(X1)
cm = confusion_matrix(y1, y_predict)
cm_df = pd.DataFrame(cm, index = ['0','1'], columns = ['0','1'] )
print(cm_df)
print(classification_report( y1, y_predict ))
print('Prediction score:', clcv.score(X_Pred, y_Pred)) # unseen data

输出:

confusion:
      0   1
0  3011   0
1     0  44

              precision    recall  f1-score   support
       False       1.00      1.00      1.00      3011
        True       1.00      1.00      1.00        44

   micro avg       1.00      1.00      1.00      3055
   macro avg       1.00      1.00      1.00      3055
weighted avg       1.00      1.00      1.00      3055

Prediction score: 0.36

问题是你过拟合了。

有很多代码没有用到,我们来删减一下:

# Set aside the last 19 rows for prediction.
X1, X_Pred, y1, y_Pred = train_test_split(X, y, test_size= 19, 
                shuffle = False, random_state=None)

clcv = DecisionTreeClassifier()
clcv.fit(X1, y1)
y_predict = clcv.predict(X1)
cm = confusion_matrix(y1, y_Pred)
cm_df = pd.DataFrame(cm, index = ['0','1'], columns = ['0','1'] )
print(cm_df)
print(classification_report( y1, y_Pred ))
print('Prediction score:', clcv.score(X_Pred, y_Pred)) # unseen data

很明显,这里没有交叉验证,预测分数低的明显原因是决策树分类器的过度拟合。

使用交叉验证的分数,您应该可以直接看到问题。