How are the gradient and hessian of logarithmic loss computed in the custom objective function example script in xgboost's GitHub repository?

Question

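I would like to understand how the gradient and hessian of the logloss function are computed in an xgboost sample script.

I've simplified the function to take numpy arrays, and generated y_hat and y_true, which are a sample of the values used in the script.

Here is the simplified example: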

import numpy as np


def loglikelihoodloss(y_hat, y_true):
    prob = 1.0 / (1.0 + np.exp(-y_hat))
    grad = prob - y_true
    hess = prob * (1.0 - prob)
    return grad, hess

y_hat = np.array([1.80087972, -1.82414818, -1.82414818,  1.80087972, -2.08465433,
                  -1.82414818, -1.82414818,  1.80087972, -1.82414818, -1.82414818])
y_true = np.array([1.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.])

loglikelihoodloss(y_hat, y_true)

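The log loss function is the sum of

$$-\left[y \ln(p) + (1 - y)\ln(1 - p)\right]$$

where $p = \frac{1}{1 + e^{-x}}$.

The gradient (with respect to p) is then $\frac{p - y}{p(1 - p)}$, however in the code it is `prob - y_true`.

Likewise, the second derivative (with respect to p) is $\frac{y}{p^2} + \frac{1 - y}{(1 - p)^2}$, however in the code it is `prob * (1.0 - prob)`.

How are the equations equal?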

Answer 1

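The log loss function is given as:

$$L = -\left[y \ln(p) + (1 - y)\ln(1 - p)\right]$$

where

$$p = \frac{1}{1 + e^{-x}}$$

and $x$ is the raw score before the sigmoid (`y_hat` in the code). The objective needs derivatives with respect to $x$, not $p$. Since $\frac{\partial p}{\partial x} = p(1 - p)$, taking the partial derivative via the chain rule gives the gradient as

$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial p} \cdot \frac{\partial p}{\partial x} = \frac{p - y}{p(1 - p)} \cdot p(1 - p) = p - y$$

Thus the gradient is $p - y$, which is `prob - y_true` in the code. (If you differentiate the log-likelihood $-L$ instead, $p - y$ is the negative of that gradient.)

A similar calculation gives the hessian: differentiating once more, $\frac{\partial^2 L}{\partial x^2} = \frac{\partial (p - y)}{\partial x} = p(1 - p)$, which is `prob * (1.0 - prob)` in the code.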
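The derivatives with respect to the raw score can also be checked numerically. The following is a minimal sketch that uses only the Python standard library instead of numpy; the helper names (`sigmoid`, `logloss`, `analytic_grad_hess`, `numeric_grad_hess`) are illustrative and are not part of xgboost. It compares the closed-form expressions p - y and p(1 - p) against central finite differences of the log loss taken with respect to x.

```python
import math


def sigmoid(x):
    """Map a raw score x to a probability p = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def logloss(x, y):
    """Log loss of a single example, written as a function of the raw score x."""
    p = sigmoid(x)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def analytic_grad_hess(x, y):
    """Closed-form first and second derivatives with respect to x."""
    p = sigmoid(x)
    return p - y, p * (1.0 - p)


def numeric_grad_hess(x, y, eps=1e-4):
    """Central finite-difference estimates of the same two derivatives."""
    f_plus = logloss(x + eps, y)
    f_zero = logloss(x, y)
    f_minus = logloss(x - eps, y)
    grad = (f_plus - f_minus) / (2.0 * eps)
    hess = (f_plus - 2.0 * f_zero + f_minus) / (eps * eps)
    return grad, hess


# (raw score, label) pairs taken from the sample arrays in the question.
samples = [(1.80087972, 1.0), (-1.82414818, 0.0), (-2.08465433, 0.0)]

max_err = 0.0
for x, y in samples:
    g_a, h_a = analytic_grad_hess(x, y)
    g_n, h_n = numeric_grad_hess(x, y)
    max_err = max(max_err, abs(g_a - g_n), abs(h_a - h_n))
    print(f"x={x:+.4f} y={y:.0f} grad={g_a:+.6f} hess={h_a:.6f}")

print("largest analytic vs numeric difference:", max_err)
```

If the two sets of values agree up to finite-difference error, that supports reading `grad` and `hess` in the script as derivatives with respect to `y_hat`, not with respect to the probability.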

numpy

machine-learning

entropy

derivative

xgboost