How do I get a confusion matrix to output a consistently shaped array (2x2) for a binary classification?

Question

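I am looping over my datasets and computing tn, fp, fn, tp for each one. Some datasets contain only 0s and I predict only 0s, so I get back just a 1x1 array for tp. I still want a 2x2 matrix returned, so that I don't get ValueError: not enough values to unpack (expected 4, got 1) from the following bit of Python: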

tn, fp, fn, tp = confusion_matrix(metrics_data[label_column],metrics_data[scored_column]).ravel()

What is the best way to solve this?

Answer 1

Add the labels argument to your confusion_matrix call, for example:

tn, fp, fn, tp = confusion_matrix(
    metrics_data[label_column],
    metrics_data[scored_column], 
    labels=[0, 1]).ravel()

From the documentation for sklearn.metrics.confusion_matrix, labels is an array-like of shape (n_classes) and is defined as:

List of labels to index the matrix. This may be used to reorder or select a subset of labels. If None is given, those that appear at least once in y_true or y_pred are used in sorted order.

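Since you left labels as None, confusion_matrix by default uses only the values it actually sees in your data, so a dataset containing a single class produces a 1x1 matrix.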
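As a minimal sketch of the loop described in the question, the snippet below passes labels=[0, 1] for every dataset, so the unpacking always receives four values. The datasets dictionary and its toy values are made up for illustration and stand in for metrics_data[label_column] and metrics_data[scored_column].

```python
from sklearn.metrics import confusion_matrix

# Hypothetical toy datasets: name -> (y_true, y_pred).
datasets = {
    "mixed": ([0, 1, 1, 0], [0, 1, 0, 0]),
    "all_zero": ([0, 0, 0], [0, 0, 0]),  # only class 0 is present
}

results = {}
for name, (y_true, y_pred) in datasets.items():
    # labels=[0, 1] fixes the matrix to 2x2 even if a class is missing.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    results[name] = (int(tn), int(fp), int(fn), int(tp))
    print(name, results[name])
```

For the all-zero dataset the single non-zero count lands in tn, since every sample is a correctly predicted 0.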

Answer 2

From the documentation:

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html

sklearn.metrics.confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None, normalize=None)
Compute confusion matrix to evaluate the accuracy of a classification.

By definition a confusion matrix $C$ is such that $C_{i,j}$ is equal to the number of observations known to be in group $i$ and predicted to be in group $j$.

Thus in binary classification, the count of true negatives is $C_{0,0}$, false negatives is $C_{1,0}$, true positives is $C_{1,1}$ and false positives is $C_{0,1}$.

Read more in the User Guide.

Parameters
y_true : array-like of shape (n_samples,)
Ground truth (correct) target values.

y_pred : array-like of shape (n_samples,)
Estimated targets as returned by a classifier.

labels : array-like of shape (n_classes), default=None
List of labels to index the matrix. This may be used to reorder or select a subset of labels. If None is given, those that appear at least once in y_true or y_pred are used in sorted order.

You can use the labels parameter to force the same size regardless of the input predictions:

confusion_matrix(true, pred, labels=[1,2,3])
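
A small illustration of that call, assuming toy values for true and pred in which label 3 never occurs. The matrix still has one row and one column per entry in labels:

```python
from sklearn.metrics import confusion_matrix

# Toy inputs (assumed for illustration); label 3 appears in neither list.
true = [1, 1, 2]
pred = [1, 2, 2]

cm = confusion_matrix(true, pred, labels=[1, 2, 3])
print(cm.shape)  # (3, 3)
print(cm)        # the row and column for label 3 are all zeros
```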

python

scikit-learn