sklearn jaccard_score giving a wrong result
I have been using sklearn.metrics.jaccard_score to gather reference scores from the binary classification tests of my Python model. It outputs the value shown below, but when I compute the metric by hand I get a different value. Am I misunderstanding what "Jaccard" means in the context of this function, or am I using it wrong? All the other metrics gathered with sklearn functions return the correct values.
Below is my code, with the Jaccard score also tested by hand (comparing the vectors as sets on a calculator gives the same result, to my (not so great) relief).
from sklearn.metrics import classification_report, confusion_matrix, jaccard_score

def test(X, y, model):
    predictions = model.predict(X, verbose=1).ravel()
    report = classification_report(y, predictions, target_names=['nao_doentes', 'doentes'])
    confMatrix = confusion_matrix(y, predictions)
    tn, fp, fn, tp = confMatrix.ravel()
    jaccard = jaccard_score(y, predictions)  # Behaving strangely

    print(tn, fp, fn, tp)
    print(predictions)
    print(y)
    print(report)
    print(confMatrix)
    print("Jaccard by function: {}".format(jaccard))

    # Note that in binary classification, recall of the positive class is also known as "sensitivity";
    # recall of the negative class is "specificity".
    dice = ((2 * tp) / ((2 * tp) + fp + fn))
    jaccard = ((tp + tn) / ((2 * (tp + tn + fn + fp)) - (tp + tn)))

    print(dice)
    print("Jaccard by hand: {}".format(jaccard))
And then the output:
2 0 1 1
[1. 0. 0. 0.]
[1 0 1 0]
              precision    recall  f1-score   support

 nao_doentes       0.67      1.00      0.80         2
     doentes       1.00      0.50      0.67         2

    accuracy                           0.75         4
   macro avg       0.83      0.75      0.73         4
weighted avg       0.83      0.75      0.73         4
[[2 0]
[1 1]]
Jaccard by function: 0.5
0.6666666666666666
Jaccard by hand: 0.6
As a second question, why does classification_report seem to treat nao_doentes (non-sick, in Portuguese) as 1 and doentes (sick) as 0? Shouldn't it be the other way around? nao_doentes is encoded as 0 and doentes as 1 in my dataset (that is, in y).
Looking at the help page, the Jaccard score is defined as:
the size of the intersection divided by the size of the union of two
label sets,
and by default it only looks at the positive class:
jaccard_score may be a poor metric if there are no positives for some
samples or classes. Jaccard is undefined if there are no true or
predicted labels, and our implementation will return a score of 0 with
a warning.
From your confusion matrix you have:

intersection = tp              # you have 1
union = tp + fp + fn           # you have 2
jaccard = intersection / union

which gives you 1 / 2 = 0.5.
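As a quick check (a minimal sketch, assuming a scikit-learn version where jaccard_score accepts pos_label, i.e. >= 0.21), the default score is exactly this positive-class value, and asking for the other class changes it:

import numpy as np
from sklearn.metrics import jaccard_score

y = np.array([1, 0, 1, 0])            # ground truth from the question
predictions = np.array([1, 0, 0, 0])  # model output from the question

# Default: Jaccard for the positive class (label 1) only
jaccard_score(y, predictions)                # 0.5  -> tp / (tp + fp + fn) = 1 / 2

# Jaccard for the negative class (label 0) instead
jaccard_score(y, predictions, pos_label=0)   # 0.666... -> 2 / (2 + 1 + 0)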
Your labels are correct. If you convert them to the label names you will see that you get the same confusion matrix:
import pandas as pd

# Map 0 -> 'nao_doentes' and 1 -> 'doentes', then cross-tabulate truth vs. prediction
labels = pd.Categorical(['nao_doentes', 'doentes'], categories=['nao_doentes', 'doentes'])
prediction = [1, 0, 0, 0]
y = [1, 0, 1, 0]

pd.crosstab(labels[y], labels[prediction])
col_0 nao_doentes doentes
row_0
nao_doentes 2 0
doentes 1 1
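The same thing with scikit-learn's own confusion_matrix (not in the original answer, just a sanity check): its rows and columns are sorted by label value, so with 0/1 labels the layout is [[tn, fp], [fn, tp]], i.e. nao_doentes first. That is also why target_names must be passed to classification_report in that order.

from sklearn.metrics import confusion_matrix

y = [1, 0, 1, 0]
prediction = [1, 0, 0, 0]

# Rows = true class, columns = predicted class, both ordered 0 then 1:
# [[tn, fp],
#  [fn, tp]]
confusion_matrix(y, prediction)
# array([[2, 0],
#        [1, 1]])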
The Jaccard score computed by hand in your question differs from the one computed with the default scikit-learn jaccard_score because the equation you used by hand computes the micro-averaged Jaccard score, while by default the scikit-learn version only computes the score for the positive class (doentes).

To see what is going on, we can compare sklearn's jaccard_score with its default settings against the calculation by hand:
import numpy as np
from sklearn.metrics import jaccard_score
y_true = np.array([1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0])
tp = 1
tn = 2
fp = 0
fn = 1
jaccard_score(y_true, y_pred)
# 0.5
# And we can check this by using the definition of the Jaccard score for the positive class:
tp / (tp + fp + fn)
# 0.5
Now let's look at the micro-averaged Jaccard score (the definition of "micro-averaged" here comes from the scikit-learn documentation):
# scikit-learn:
jaccard_score(y_true, y_pred, average='micro')
# 0.6
# Definition of micro-averaged ("Calculate metrics globally by counting
# the total true positives, false negatives and false positives").
# Here we have to define another set of outcomes but this time with the
# original negative class as the positive class:
tp_0 = 2
fp_0 = 1
tn_0 = 1
fn_0 = 0
(tp+tp_0)/(tp+tp_0+fp+fp_0+fn+fn_0)
# 0.6
# And let's now compare this to the original calculation by hand in the question:
(tp + tn) / ((2*(tp + tn + fn + fp)) - (tp + tn))
# 0.6
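Not part of the original calculation, but for completeness (assuming the same scikit-learn version as above), jaccard_score can also return the per-class scores and their macro average directly:

# Per-class Jaccard scores, ordered class 0 then class 1
jaccard_score(y_true, y_pred, average=None)
# array([0.66666667, 0.5])

# Macro average: the unweighted mean of the per-class scores
jaccard_score(y_true, y_pred, average='macro')
# 0.5833333333333333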