How can I define my own scoring strategy for sklearn.model_selection.GridSearchCV?

Question

I want to define a new scoring function for GridSearchCV, as described here: http://scikit-learn.org/stable/modules/model_evaluation.html#implementing-your-own-scoring-object . This is my code:

from sklearn.model_selection import GridSearchCV
def pe_score(estimator,x,y):
    clf=estimator
    clf.fit(x,y)
    z=clf.predict(x)
    pe=prob_error(z, y)
    return pe

pe_error=pe_score(SVC(),xTrain,yTrain)
grid = GridSearchCV(SVC(), param_grid={'kernel':('linear', 'rbf'), 'C':[1, 10, 100,1000,10000]}, scoring=pe_error)

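Here prob_error(z, y) is the function that computes the error I want to minimize, where z is the prediction on the training set and y holds the true training values. However, I get the following error: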

---> 18 clf.fit(xTrain, yTrain)
TypeError: 'numpy.float64' object is not callable

I am not sure whether pe_error is defined in the right format. How can I fix this? Thanks.

Answer 1

The scoring function should have the form score_func(y, y_pred, **kwargs), that is, the true values first and the predictions second.
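As a minimal sketch of that signature: the question does not show prob_error, so the version below is a hypothetical stand-in that assumes it means the fraction of misclassified samples. Only the argument order of pe_score matters here.

```python
import numpy as np

def prob_error(y_pred, y_true):
    """Hypothetical stand-in for the asker's prob_error:
    the fraction of predictions that differ from the true labels."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    return float(np.mean(y_pred != y_true))

def pe_score(y_true, y_pred):
    # Metric signature: true labels first, predictions second.
    return prob_error(y_pred, y_true)

# One of four predictions is wrong.
print(pe_score([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.25
```

A function of this shape only compares two arrays; it never fits or calls the estimator itself.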

You can then wrap your scoring function with make_scorer so that it works with GridSearchCV.

So in this case it would be:

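from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer
from sklearn.svm import SVC

def pe_score(y, y_pred):
    # make_scorer calls this with the true labels first, then the predictions
    pe = prob_error(y_pred, y)
    return pe

# prob_error is an error to minimize, so greater_is_better=False makes the
# scorer return its negative (GridSearchCV always picks the highest score)
pe_error = make_scorer(pe_score, greater_is_better=False)
grid = GridSearchCV(SVC(), param_grid={'kernel': ('linear', 'rbf'), 'C': [1, 10, 100, 1000, 10000]}, scoring=pe_error)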

(I am assuming you have already implemented or imported prob_error elsewhere in your code.)
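For a version that runs end to end, here is a sketch under stated assumptions: prob_error is a hypothetical misclassification rate, the data is synthetic (make_classification stands in for xTrain and yTrain), and the grid and cv value are reduced so it finishes quickly. Since the goal is to minimize the error, the scorer is built with greater_is_better=False.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def prob_error(y_pred, y_true):
    # Hypothetical stand-in: fraction of misclassified samples.
    return float(np.mean(np.asarray(y_pred) != np.asarray(y_true)))

def pe_score(y_true, y_pred):
    # Metric signature expected by make_scorer.
    return prob_error(y_pred, y_true)

# Synthetic data standing in for the asker's xTrain / yTrain.
xTrain, yTrain = make_classification(n_samples=120, n_features=5, random_state=0)

# greater_is_better=False negates the metric, so the "highest score"
# chosen by GridSearchCV corresponds to the lowest error.
pe_error = make_scorer(pe_score, greater_is_better=False)

grid = GridSearchCV(
    SVC(),
    param_grid={'kernel': ('linear', 'rbf'), 'C': [1, 10, 100]},
    scoring=pe_error,
    cv=3,
)
grid.fit(xTrain, yTrain)

print(grid.best_params_)
print(-grid.best_score_)  # mean cross-validated error of the best setting
```

Note that best_score_ is the negated error, so flip the sign to read it as an error rate.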

Documentation: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html
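The approach in the question can also be made to work without make_scorer, because scoring accepts a callable with the signature scorer(estimator, X, y). The TypeError comes from calling pe_score(SVC(), xTrain, yTrain) and passing its numeric result, rather than the function itself, as scoring. The sketch below again assumes a hypothetical prob_error and synthetic data. The scorer does not refit the estimator, since GridSearchCV hands it one already fitted on the training fold, and it returns the negated error because higher scores are treated as better.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def prob_error(y_pred, y_true):
    # Hypothetical stand-in: fraction of misclassified samples.
    return float(np.mean(np.asarray(y_pred) != np.asarray(y_true)))

def neg_pe_scorer(estimator, X, y):
    # The estimator arrives already fitted; only predict on the held-out fold.
    return -prob_error(estimator.predict(X), y)

# Synthetic data standing in for the asker's training set.
X, y = make_classification(n_samples=120, n_features=5, random_state=0)

grid = GridSearchCV(
    SVC(),
    param_grid={'kernel': ('linear', 'rbf'), 'C': [1, 10]},
    scoring=neg_pe_scorer,  # pass the function itself, do not call it
    cv=3,
)
grid.fit(X, y)

print(grid.best_params_, -grid.best_score_)
```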

python

machine-learning

python-3.x

scikit-learn

grid-search