revoScaleR::rxGlm() Question in R - GLM Residuals

Question

I may not find an answer here, since I don't think the revoScaleR package is widely used.

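If I create a GLM with rxGlm(), it works fine. However, the model residuals available through rxPredict() appear to be only the "raw" residuals, i.e. observed value minus fitted value. The various transformed versions (deviance residuals, Pearson residuals, etc.) do not seem to be available.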

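Does anyone know whether there is a way to achieve this? I can get the model's deviance residuals (for example) by refitting the model with glm() (same formula, data, error structure, link function, and weights) and calling residuals(glm_object, type = "deviance"), but this is cumbersome because glm() runs very slowly (large dataset, many model parameters).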

Thanks.

Edit: including the guidance from the literature that I am trying to follow:

Answer 1

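From your question it is hard to tell exactly what the RevoScaleR package offers in terms of residuals and which residuals you actually need. On top of that, there is considerable confusion about residual terminology, for example here and here.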

A few points/observations that may help you.

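In linear regression, "raw" residuals are identical to "deviance" residuals

At least that is what I get from running a toy regression with glm and predicting the outcome: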

df <- mtcars
# glm() defaults to the Gaussian family with identity link
modl <- glm(formula = mpg ~ wt + qsec + am, data = df)
y_hat <- predict(modl)  # fitted values

Next, compute the "raw" residuals (observed outcome minus predicted outcome) as well as the deviance residuals:

y <- as.vector(df[["mpg"]])
res_raw <- y - y_hat
res_dev <- residuals(modl, type = "deviance")

The two are identical:

identical(res_raw, res_dev)
[1] TRUE

I imagine it gets more complicated once you move on to binary outcomes and the like.
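To illustrate the non-Gaussian case, here is a minimal sketch on a toy logistic regression. It assumes all you have are the observed responses, the fitted means on the response scale, and the prior weights, which is roughly what one would expect to be able to pull out of an rxGlm() fit via rxPredict() (that part is an assumption and is not tested here). The formulas mirror what residuals.glm() does in base R; the model am ~ wt is chosen only for illustration.

```r
# Toy logistic regression: am is a 0/1 outcome in mtcars
fit <- glm(am ~ wt, data = mtcars, family = binomial())

y   <- fit$y                 # observed responses
mu  <- fitted(fit)           # fitted means on the response scale
w   <- fit$prior.weights     # prior weights (all 1 in this example)
fam <- family(fit)           # family object: variance and deviance functions

# Raw residuals: observed minus fitted
res_raw <- y - mu

# Deviance residuals: signed square root of each unit deviance
res_dev_manual <- sign(y - mu) * sqrt(pmax(fam$dev.resids(y, mu, w), 0))

# Pearson residuals: raw residuals scaled by the square root of the variance function
res_pear_manual <- (y - mu) * sqrt(w) / sqrt(fam$variance(mu))

# Compare with the built-in results
all.equal(unname(res_dev_manual),  unname(residuals(fit, type = "deviance")))
all.equal(unname(res_pear_manual), unname(residuals(fit, type = "pearson")))

# Unlike the Gaussian case, raw and deviance residuals now differ
head(cbind(raw = res_raw, deviance = res_dev_manual, pearson = res_pear_manual))
```

Because only y, mu, and the weights enter these formulas, nothing here requires refitting the model with glm(); the glm() call above is only there to produce values to check against.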

Formula for computing standardized deviance residuals

Standardized deviance residuals are computed from a glm with the rstandard method.

res_std <- rstandard(modl)

Looking at getAnywhere(rstandard.glm) shows how to compute the standardized residuals by hand from the deviance residuals:

function (model, infl = influence(model, do.coef = FALSE), type = c("deviance", 
    "pearson"), ...) 
{
    type <- match.arg(type)
    res <- switch(type, pearson = infl$pear.res, infl$dev.res)
    res <- res/sqrt(summary(model)$dispersion * (1 - infl$hat)) # this is the key line
    res[is.infinite(res)] <- NaN
    res
}

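So in my example you would compute the standardized residuals by hand by running res/sqrt(summary(modl)$dispersion * (1 - influence(modl)$hat)), where res is the deviance residual. You therefore need two things: the hat values and the dispersion. I assume RevoScaleR provides the dispersion parameter. If RevoScaleR has nothing like influence(modl)$hat for getting the hat values, you will have to compute them from scratch (the formula below is the one for an unweighted Gaussian model with identity link, as in this example):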

X <- as.matrix(df[, c("wt", "qsec", "am")]) # Gets the X variables
X <- cbind(rep(1, nrow(df)), X) # adds column for the constant
hat <- diag(X %*% solve(t(X) %*% X) %*% t(X)) # formula for hat values

Now compute your standardized deviance residuals:

res_man <- res_raw/sqrt(summary(modl)$dispersion * (1 - hat))

This matches the result from rstandard:

head(res_man)
        Mazda RX4     Mazda RX4 Wag        Datsun 710    Hornet 4 Drive Hornet Sportabout           Valiant 
       -0.6254171        -0.4941877        -1.4885771         0.2297471         0.7217423        -1.1790097 
head(res_std)
        Mazda RX4     Mazda RX4 Wag        Datsun 710    Hornet 4 Drive Hornet Sportabout           Valiant 
       -0.6254171        -0.4941877        -1.4885771         0.2297471         0.7217423        -1.1790097
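
One practical caveat for large datasets: diag(X %*% solve(t(X) %*% X) %*% t(X)) builds the full n-by-n hat matrix before taking its diagonal. The sketch below computes only the diagonal and also allows for weights, using the general form W^(1/2) X (X'WX)^(-1) X' W^(1/2). The helper name hat_diag and the use of cyl as prior weights are hypothetical, chosen only for illustration. For a Gaussian model with identity link, w is the vector of prior weights; for other families the working weights from the final fitting iteration take that role.

```r
# Hypothetical helper: diagonal of the (weighted) hat matrix without forming n x n
hat_diag <- function(X, w = rep(1, nrow(X))) {
  Xw <- X * sqrt(w)                  # scale each row of X by sqrt(weight)
  XtWX_inv <- solve(crossprod(Xw))   # (X' W X)^{-1}, a small p x p matrix
  rowSums((Xw %*% XtWX_inv) * Xw)    # row-wise quadratic forms = hat diagonal
}

# Unweighted case: same model as in the answer
modl <- glm(mpg ~ wt + qsec + am, data = mtcars)
hat_unw <- hat_diag(model.matrix(modl))
all.equal(unname(hat_unw), unname(hatvalues(modl)))

# Weighted Gaussian case (cyl used as illustrative prior weights)
modl_w <- glm(mpg ~ wt + qsec + am, data = mtcars, weights = cyl)
hat_w  <- hat_diag(model.matrix(modl_w), w = mtcars$cyl)
all.equal(unname(hat_w), unname(hatvalues(modl_w)))

# Standardized deviance residuals for the weighted fit, computed by hand
res_std_w <- residuals(modl_w, type = "deviance") /
  sqrt(summary(modl_w)$dispersion * (1 - hat_w))
all.equal(unname(res_std_w), unname(rstandard(modl_w)))
```

The only matrix inverted here is p-by-p, so memory use grows linearly with the number of rows rather than quadratically.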
