Calculating recall when StratifiedShuffleSplit is used

Question

Because my dataset is imbalanced, the following function combines a KNN classifier with StratifiedShuffleSplit:

def KNN(train_x, train_y):
    skf = StratifiedShuffleSplit()
    scores = []
    for train, test in skf.split(train_x, train_y):
        clf = KNeighborsClassifier(n_neighbors=2, n_jobs=-1)
        clf.fit(train_x.loc[train], train_y.loc[train])
        score = clf.score(train_x.loc[test], train_y.loc[test])
        scores.append(score)

    res = np.asarray(scores).mean()
    print(res)

How can I change scores so that it holds recall and precision instead of the default accuracy?

Thanks,

Answer 1

You need:

sklearn.metrics.recall_score(y_true, y_pred)
sklearn.metrics.precision_score(y_true, y_pred)
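
As a quick illustration of what these two functions return, here is a toy example. The labels are made up for this sketch and are not taken from the question's data; both calls use the default binary setting, where class 1 is treated as the positive class.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 4 true positives in y_true, 3 predicted positives in y_pred
y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0]

# recall = TP / (TP + FN) = 2 / 4
recall = recall_score(y_true, y_pred)
# precision = TP / (TP + FP) = 2 / 3
precision = precision_score(y_true, y_pred)

print(recall, precision)
```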

import numpy as np
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neighbors import KNeighborsClassifier

def KNN(train_x, train_y):
    skf = StratifiedShuffleSplit()
    recalls = []
    precisions = []
    for train, test in skf.split(train_x, train_y):
        clf = KNeighborsClassifier(n_neighbors=2, n_jobs=-1)
        # split() yields positional indices, so index with .iloc rather than .loc
        clf.fit(train_x.iloc[train], train_y.iloc[train])
        y_pred = clf.predict(train_x.iloc[test])  # predicted labels for the test fold
        y_true = train_y.iloc[test]  # true labels for the test fold
        recalls.append(recall_score(y_true, y_pred))  # recall for this fold
        precisions.append(precision_score(y_true, y_pred))  # precision for this fold

    # average each metric over all splits
    print(np.mean(recalls), np.mean(precisions))
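
If you would rather not write the loop by hand, one possible alternative is scikit-learn's cross_validate, which accepts several scorer names at once. The sketch below is only an illustration: the helper name knn_cv_metrics, the synthetic data from make_classification, and the choices n_splits=10 and test_size=0.2 are all assumptions made for the example, not part of the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit, cross_validate
from sklearn.neighbors import KNeighborsClassifier


def knn_cv_metrics(X, y, n_splits=10, random_state=0):
    """Return mean recall and precision of a 2-NN classifier over stratified shuffle splits."""
    cv = StratifiedShuffleSplit(
        n_splits=n_splits, test_size=0.2, random_state=random_state
    )
    clf = KNeighborsClassifier(n_neighbors=2)
    # Each scorer name produces a "test_<name>" array with one value per split
    results = cross_validate(clf, X, y, cv=cv, scoring=("recall", "precision"))
    return {
        "recall": results["test_recall"].mean(),
        "precision": results["test_precision"].mean(),
    }


# Synthetic imbalanced data (about 90% / 10%), used here only as a stand-in
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
metrics = knn_cv_metrics(X, y)
print(metrics)
```

The "recall" and "precision" scorers assume binary labels with 1 as the positive class; for multiclass targets an averaged variant such as "recall_macro" would be needed instead.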


machine-learning

knn

python-3.x

scikit-learn