How do I vectorise an accuracy metric computation?

Question

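I'm writing my own accuracy function (correct predictions / total predictions) for some given truth labels, e.g. [0, 1, 1, ...], and probabilities, e.g. [[0.8, 0.2], [0.3, 0.7], [0.1, 0.9] ...]. I don't want to use a library function such as sklearn's accuracy_score().

I created this version using a for loop: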

def compute_accuracy(truth_labels, probs):
    total = 0
    total_correct = 0
    for index, prob in enumerate(probs):
        predicted_label = 0 if prob[0] > 0.5 else 1
        if predicted_label == truth_labels[index]:
            total_correct += 1
        total += 1
    if total:
        return total_correct / total
    else:
        return -1

I'd now like to make this more efficient by vectorising it. My goal is to check whether the probability > 0.5 matches the truth label:

import numpy as np

def compute_accuracy(truth_labels, probs):
    return ((np.array(probs[:][value_of_truth_labels_at_same_index]) > 0.5).astype(int) == np.array(truth_labels)).mean()

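At this point I'm not sure how to get value_of_truth_labels_at_same_index (the probability stored at the index given by the truth label in each row) without going back to a for loop.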

Answer 1

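import numpy as np

N = 10
# Random binary truth labels and random scores (demo data only; rows are not normalised)
X = np.random.randint(0, 2, (N,))
p = np.random.random((N, 2))
# Predicted class = column with the larger score; compare with the labels and average
acc = np.mean(np.argmax(p, axis=1) == X) * 100
print(f'Accuracy: {acc}%')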
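
For reference, here is a minimal sketch of how the same idea might be wrapped in the question's compute_accuracy(truth_labels, probs) signature. It assumes binary labels and one row of two probabilities per sample. The first function mirrors the loop's rule (predict 0 when prob[0] > 0.5, otherwise 1) and keeps its -1 return value for empty input. The second, compute_accuracy_by_indexing, is a hypothetical name for the "probability at the truth label's index" approach the question was reaching for, done with integer array indexing.

```python
import numpy as np


def compute_accuracy(truth_labels, probs):
    """Vectorised accuracy that follows the loop version's conventions."""
    truth = np.asarray(truth_labels)
    probs = np.asarray(probs, dtype=float)
    if truth.size == 0:
        return -1  # same sentinel the loop version returns for empty input
    # Same rule as the loop: predict 0 when P(class 0) > 0.5, otherwise 1.
    predicted = np.where(probs[:, 0] > 0.5, 0, 1)
    return float(np.mean(predicted == truth))


def compute_accuracy_by_indexing(truth_labels, probs):
    """Hypothetical variant: look up the probability assigned to the true class."""
    truth = np.asarray(truth_labels, dtype=int)
    probs = np.asarray(probs, dtype=float)
    if truth.size == 0:
        return -1
    # Integer array indexing: for row i, pick column truth[i].
    prob_of_truth = probs[np.arange(truth.size), truth]
    # Assumes each row sums to 1, so "true class > 0.5" means a correct prediction.
    return float(np.mean(prob_of_truth > 0.5))


truth_labels = [0, 1, 1]
probs = [[0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
print(compute_accuracy(truth_labels, probs))
print(compute_accuracy_by_indexing(truth_labels, probs))
```

One caveat: the variants can disagree on exact ties. For a row such as [0.5, 0.5] the loop predicts 1, np.argmax picks 0, and the indexing variant counts the row as wrong whatever the label is. If ties matter in your data, pick the rule you want explicitly.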

python

performance

vectorization