Scorer not recognizing inputs

I am trying to use a custom scorer with the following code:

def edge_score(y, y_pred):
    y_pred.name = 'y_pred'
    y.name = 'y'

    df = pd.concat([y_pred, y])

    df['sign_pred'] = df.y_pred.apply(np.sign)
    df['sign_true'] = df.y.apply(np.sign)
    df['is_correct'] = 0
    df.loc[
        df.sign_pred * df.sign_true > 0, 'is_correct'] = 1
    df['is_incorrect'] = 0
    df.loc[
        df.sign_pred * df.sign_true < 0, 'is_incorrect'] = 1
    df['is_predicted'] = df.is_correct + df.is_incorrect
    df['result'] = df.sign_pred * df.y
    df['edge'] = df.result.mean()
    output_errors = df[['edge']]
    output_errors.to_numpy()

    return np.average(output_errors)
edge = make_scorer(edge_score)

I get the following error:

AttributeError: 'numpy.ndarray' object has no attribute 'name'

When I comment out the .name lines, I get the following error:

TypeError: cannot concatenate object of type '<class 'numpy.ndarray'>'; only Series and DataFrame objs are valid

When I convert the true values and the predictions to DataFrames, I get the following error:

y_pred = pd.DataFrame(y_pred)
y = pd.DataFrame(y)

AttributeError: 'DataFrame' object has no attribute 'y_pred'

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html#sklearn.metrics.make_scorer

Change these lines of code:

    df['sign_pred'] = df.y_pred.apply(np.sign)
    df['sign_true'] = df.y.apply(np.sign)

to these:

    df['sign_pred'] = np.sign(y_pred)
    df['sign_true'] = np.sign(y)
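
For context, the errors in the question show that the scorer is being called with plain NumPy arrays, which have no .name attribute and cannot be passed to pd.concat. A minimal sketch, using made-up sample values, of how np.sign can be applied to such arrays directly:

```python
import numpy as np

# Hypothetical sample values, for illustration only
y = np.array([1.5, -2.0, 0.0])

# A plain ndarray carries no pandas-style .name attribute
has_name = hasattr(y, "name")

# np.sign is applied element-wise, so no .apply is needed
signs = np.sign(y)
```

Here has_name is False, which matches the first AttributeError in the question.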

You should first create a DataFrame from the two NumPy arrays, y and y_pred, and then perform all of the operations on it.

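import numpy as np
import pandas as pd
from sklearn.metrics import make_scorer

def edge_score(y, y_pred):
    # Build the DataFrame directly from the two NumPy arrays
    df = pd.DataFrame({"y": y,
                       "y_pred": y_pred})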

    df['sign_pred'] = df.y_pred.apply(np.sign)
    df['sign_true'] = df.y.apply(np.sign)
    df['is_correct'] = 0
    df.loc[
        df.sign_pred * df.sign_true > 0, 'is_correct'] = 1
    df['is_incorrect'] = 0
    df.loc[
        df.sign_pred * df.sign_true < 0, 'is_incorrect'] = 1
    df['is_predicted'] = df.is_correct + df.is_incorrect
    df['result'] = df.sign_pred * df.y
    # The edge is the mean of sign(prediction) * true value
    return df['result'].mean()

edge = make_scorer(edge_score)

def custom_score(y_true, y_pred):
  true_sign = np.sign(y_true)
  pred_sign = np.sign(y_pred)
  true_vs_pred = np.where(true_sign == pred_sign, 1, 0)
  true_pred = (true_vs_pred == 1).sum()
  return true_pred
custom_scorer = make_scorer(custom_score, greater_is_better=True)

Convert everything to arrays and then work on them.
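
As a rough sketch of that array-only approach applied to the edge metric from the question, assuming the intended score is the mean of sign(y_pred) * y. The name edge_score_np and the sample values are hypothetical and not from the original posts:

```python
import numpy as np

def edge_score_np(y_true, y_pred):
    # Coerce inputs (lists, Series, or arrays) to flat float arrays
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    # Mean of sign(prediction) * true value
    return float(np.mean(np.sign(y_pred) * y_true))

# Hypothetical sample values, for illustration only
score = edge_score_np([1.0, -2.0, 3.0], [0.5, 1.0, -1.0])
```

A function like this could be wrapped with make_scorer(edge_score_np) in the same way as the scorers above.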