Back-transform coefficients from glmer with scaled independent variables for prediction
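I have fitted a mixed model using the lme4 package. Before fitting the model, I transformed my independent variables with the scale() function. I now want to display my results on a graph using predict(), so I need the predicted data to be back on the original scale. How do I do this?
Simplified example: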
library(lme4)
database <- mtcars
# Scale data
database$wt <- scale(mtcars$wt)
database$am <- scale(mtcars$am)
# Make model
model.1 <- glmer(vs ~ scale(wt) + scale(am) + (1|carb), database, family = binomial, na.action = "na.fail")
# make new data frame with all values set to their mean
xweight <- as.data.frame(lapply(lapply(database[, -1], mean), rep, 100))
# make new values for wt
xweight$wt <- (wt = seq(min(database$wt), max(database$wt), length = 100))
# predict from new values
a <- predict(model.1, newdata = xweight, type="response", re.form=NA)
# returns scaled prediction
I have tried using this example to back-transform the predictions:
# save scale and center values
scaleList <- list(scale = attr(database$wt, "scaled:scale"),
center = attr(database$wt, "scaled:center"))
# back-transform predictions
a.unscaled <- a * scaleList$scale + scaleList$center
# Make model with unscaled data to compare
un.model.1 <- glmer(vs ~ wt + am + (1|carb), mtcars, family = binomial, na.action = "na.fail")
# make new data frame with all values set to their mean
un.xweight <- as.data.frame(lapply(lapply(mtcars[, -1], mean), rep, 100))
# make new values for wt
un.xweight$wt <- (wt = seq(min(mtcars$wt), max(mtcars$wt), length = 100))
# predict from new values
b <- predict(un.model.1, newdata = xweight, type="response", re.form=NA)
all.equal(a.unscaled,b)
# [1] "Mean relative difference: 0.7223061"
This doesn't work - there shouldn't be any difference. What have I done wrong?
I have also looked at a number of similar questions but have not managed to apply any of them to my case (How to unscale the coefficients from an lmer()-model fitted with a scaled response, unscale and uncenter glmer parameters, Scale back linear regression coefficients in R from scaled and centered data, https://stats.stackexchange.com/questions/302448/back-transform-mixed-effects-models-regression-coefficients-for-fixed-effects-f).
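The problem with your approach is that it only "unscales" based on the wt variable, whereas you scaled all of the variables in your regression model. Note also that with type="response" the predictions are probabilities, so multiplying them by the scale of wt does not put them on any meaningful scale. One approach that works is to adjust all of the variables in your new (prediction) data frame using the centering/scaling values that were used on the original data frame: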
## scale variable x using center/scale attributes
## of variable y
scfun <- function(x,y) {
scale(x,
center=attr(y,"scaled:center"),
scale=attr(y,"scaled:scale"))
}
## scale prediction frame
xweight_sc <- transform(xweight,
wt = scfun(wt, database$wt),
am = scfun(am, database$am))
## predict
p_unsc <- predict(model.1,
newdata=xweight_sc,
type="response", re.form=NA)
Comparing this p_unsc to the predictions from the unscaled model (b in your code), i.e. all.equal(b,p_unsc), gives TRUE.
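To illustrate why the predicted probabilities themselves need no back-transformation, here is a minimal sketch. It makes some assumptions that are not in the question: it uses a plain glm() as a stand-in for glmer() so that it runs with base R only, it uses a single predictor, and the names wt_sc, grid_sc and grid_raw are made up for the illustration. Only the x-axis values are mapped back to original units.

```r
# Stand-in model: plain glm() instead of glmer(), so only base R is needed.
dat <- mtcars
wt_sc <- scale(dat$wt)                      # keeps center/scale as attributes
wt_center <- attr(wt_sc, "scaled:center")
wt_scale  <- attr(wt_sc, "scaled:scale")
dat$wt_sc <- as.numeric(wt_sc)

fit_sc  <- glm(vs ~ wt_sc, data = dat, family = binomial)  # scaled predictor
fit_raw <- glm(vs ~ wt,    data = dat, family = binomial)  # original predictor

# Prediction grid on the scaled axis, then mapped back to original units
grid_sc  <- seq(min(dat$wt_sc), max(dat$wt_sc), length.out = 100)
grid_raw <- grid_sc * wt_scale + wt_center

p_sc  <- predict(fit_sc,  newdata = data.frame(wt_sc = grid_sc), type = "response")
p_raw <- predict(fit_raw, newdata = data.frame(wt = grid_raw),   type = "response")

# The two sets of probabilities should agree; only the x-axis differs
print(all.equal(unname(p_sc), unname(p_raw), tolerance = 1e-6))

# For the plot, pair the scaled-model predictions with the original-unit grid:
# plot(grid_raw, p_sc, type = "l", xlab = "wt (original units)", ylab = "P(vs = 1)")
```

With a glmer() fit the same idea should carry over: predict on the scaled grid with re.form=NA and plot against the back-transformed grid.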
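Another reasonable approach would be to
- unscale/uncenter all of your parameters using the "unscaling" approaches presented in one of the linked questions (such as this one), generating a coefficient vector beta_unsc
- construct the appropriate model matrix from your prediction frame: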
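## model.matrix() takes its data through the `data` argument;
## drop the response so pred_frame only needs the predictors
X <- model.matrix(delete.response(terms(formula(model,fixed.only=TRUE))),
data=pred_frame)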
- compute the linear predictor and back-transform:
pred <- plogis(X %*% beta_unsc)
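The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the method from the linked questions: it again uses glm() as a base-R stand-in for glmer() (with a glmer() fit you would take the fixed effects from fixef() instead of coef()), and the names preds, centers, scales and pred_frame are hypothetical. The unscaling uses the usual algebra for a predictor standardised as (x - c) / s: each slope is divided by s, and the intercept is reduced by the sum of b * c / s.

```r
dat <- mtcars
preds <- c("wt", "am")

# Scale the predictors and keep the centering/scaling constants
sc <- scale(dat[, preds])
centers <- attr(sc, "scaled:center")
scales  <- attr(sc, "scaled:scale")
dat_sc <- data.frame(vs = dat$vs, sc)     # columns: vs, wt (scaled), am (scaled)

# Stand-in for the glmer() fit on scaled predictors
fit_sc  <- glm(vs ~ wt + am, data = dat_sc, family = binomial)
beta_sc <- coef(fit_sc)                   # with glmer(), use fixef(fit) here

# Unscale: slope_j = b_j / s_j ; intercept = b_0 - sum(b_j * c_j / s_j)
beta_unsc <- c(beta_sc[1] - sum(beta_sc[preds] * centers / scales),
               beta_sc[preds] / scales)
names(beta_unsc)[1] <- "(Intercept)"

# Prediction frame in original units: vary wt, hold am at its mean
pred_frame <- data.frame(wt = seq(min(dat$wt), max(dat$wt), length.out = 100),
                         am = mean(dat$am))
X <- model.matrix(~ wt + am, data = pred_frame)

# Linear predictor, then inverse logit
pred <- plogis(drop(X %*% beta_unsc))

# Check against a fit on the unscaled data
fit_raw <- glm(vs ~ wt + am, data = dat, family = binomial)
print(all.equal(beta_unsc, coef(fit_raw), tolerance = 1e-6))
```

For a mixed model the agreement between the unscaled coefficients and a fit on the raw data may be looser than for glm(), since the optimiser can stop at slightly different points.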