Evaluating the performance of a one-class SVM

Question

I have been trying to evaluate the performance of my one-class SVM. I tried plotting a ROC curve with scikit-learn, and the result looks a bit strange.

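import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM

X_train, X_test = train_test_split(compressed_dataset, test_size=0.5, random_state=42)

clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
y_score = clf.fit(X_train).decision_function(X_test)

pred = clf.predict(X_train)

fpr, tpr, thresholds = roc_curve(pred, y_score)
roc_auc = auc(fpr, tpr)

# Plot the ROC curve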

plt.figure()
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc="lower right")
plt.show()

The ROC curve I got:

Can anyone help me with this?

Answer 1

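What is strange about this plot? You fixed a single pair of nu and gamma values, so your model is neither overfitting nor underfitting. Moving the threshold (which is the variable a ROC curve sweeps over) does not lead to 100% TPR. Try a high gamma and a very small nu (nu is an upper bound on the fraction of training errors), and you will get a more "typical" plot.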
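As a rough sketch of that suggestion, the snippet below compares a few (nu, gamma) settings by ROC AUC. The synthetic inlier/outlier data, the `y_test` labels, and the specific parameter values are assumptions made for illustration only; the question does not show ground-truth labels for `compressed_dataset`. Note that the first argument to scikit-learn's ROC functions is the true labels, not the model's own predictions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)

# Hypothetical data: inliers near the origin, outliers spread uniformly.
X_train = rng.normal(0.0, 1.0, size=(200, 2))   # train on inliers only
X_test_in = rng.normal(0.0, 1.0, size=(100, 2))
X_test_out = rng.uniform(-6.0, 6.0, size=(100, 2))
X_test = np.vstack([X_test_in, X_test_out])
y_test = np.r_[np.ones(100), -np.ones(100)]     # +1 inlier, -1 outlier

results = {}
for nu, gamma in [(0.1, 0.1), (0.01, 1.0), (0.5, 0.1)]:
    clf = OneClassSVM(nu=nu, kernel="rbf", gamma=gamma).fit(X_train)
    # Higher decision_function values mean "more inlier-like".
    scores = clf.decision_function(X_test)
    results[(nu, gamma)] = roc_auc_score(y_test, scores)

for params, auc_value in results.items():
    print(params, round(auc_value, 3))
```

Which setting looks best will depend on the data, so the values above are a starting point for a sweep rather than a recommendation.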

Answer 2

In my view, first get the scores:

pred_scores = clf.score_samples(X_train)

Then pred_scores needs to be min-max normalized before it is used to compute the ROC curve.
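A minimal sketch of this step, assuming labelled evaluation data is available. The synthetic data and the `y_true` labels are hypothetical and not part of the question. Because min-max scaling preserves the ordering of the scores, the AUC should come out the same before and after; the rescaling mainly puts the thresholds on a [0, 1] scale.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Hypothetical labelled data: +1 for inliers, -1 for outliers.
X_train = rng.normal(0.0, 1.0, size=(200, 2))
X_eval = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
                    rng.uniform(-6.0, 6.0, size=(50, 2))])
y_true = np.r_[np.ones(50), -np.ones(50)]

clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1).fit(X_train)
pred_scores = clf.score_samples(X_eval)

# Min-max normalize the raw scores to the [0, 1] range.
score_min, score_max = pred_scores.min(), pred_scores.max()
norm_scores = (pred_scores - score_min) / (score_max - score_min)

# True labels go first; the scores are what gets thresholded.
fpr, tpr, thresholds = roc_curve(y_true, norm_scores, pos_label=1)
auc_raw = roc_auc_score(y_true, pred_scores)
auc_norm = roc_auc_score(y_true, norm_scores)
print(auc_raw, auc_norm)
```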

Tags: svm, scikit-learn