R: how to specify predictors in an mboost model
I have the following dataset, with 3 covariate columns and 1 outcome column:
data <- structure(list(V1 = c(0.368203440103238, 0.324519532540959, -0.267369607029419,
-0.551350850969297, 0.12599748535452), V2 = c(-0.685091020879978,
0.0302665318913346, 0.38152909685676, -0.741473194305708, 1.01476858643759
), V3 = c(-1.11459785962843, -0.012932271762972, 2.02715929057818,
0.118419126609398, -1.01804828579617), y = c(-1.95083653823476,
-0.50091658480941, 3.74423248124182, -0.0459478421882341, -1.24653151600941
)), class = "data.frame", row.names = c("X1", "X2", "X3", "X4",
"X5"))
> head(data)
V1 V2 V3 y
X1 0.3682034 -0.68509102 -1.11459786 -1.95083654
X2 0.3245195 0.03026653 -0.01293227 -0.50091658
X3 -0.2673696 0.38152910 2.02715929 3.74423248
X4 -0.5513509 -0.74147319 0.11841913 -0.04594784
X5 0.1259975 1.01476859 -1.01804829 -1.24653152
I want to fit the following model:
library(mboost)
model <- mboost(y ~ bols(V1, intercept = FALSE) +
bols(V2, intercept = FALSE) + bols(V3, intercept = FALSE),
data = data)
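However, typing bols(covariate, intercept = FALSE) for every column in the model is tedious. Is there a way to do this automatically for an arbitrary number of covariates? For example, I currently have 3 covariates named V1, V2, V3. But what if I had 10, named V1-V10? I would like to avoid typing 10 bols() statements.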
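We can build the formula expression with paste: wrap every column name except y in bols(..., intercept = FALSE), join the pieces with " + ", and convert the string with as.formula. The model$call[[2]] <- fmla line only affects printing; it appears to swap the name fmla in the stored call for the expanded formula, so the Call: output shows the actual terms.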
fmla <- as.formula(paste0('y ~ ', paste0('bols(', setdiff(names(data),
'y'), ', intercept = FALSE)', collapse= " + ")))
model <- mboost(fmla, data = data)
model$call[[2]] <- fmla
model
# Model-based Boosting
#Call:
#mboost(formula = y ~ bols(V1, intercept = FALSE) + bols(V2, intercept = FALSE) + bols(V3, intercept = FALSE), data = data)
# Squared Error (Regression)
#Loss function: (y - f)^2
#Number of boosting iterations: mstop = 100
#Step size: 0.1
#Offset: 1.157408e-15
#Number of baselearners: 3
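As a possible alternative to pasting the whole formula string by hand, base R's reformulate() can assemble the formula from a character vector of term labels. This is only a sketch under some assumptions: the data frame below uses rounded values from the head(data) output above, the names covariates and term_labels are made up for illustration, and the column names are assumed to be syntactically valid R names (otherwise they would need backticks inside bols()).

```r
# Rounded copy of the example data, only so the sketch is self-contained.
data <- data.frame(
  V1 = c(0.368, 0.325, -0.267, -0.551, 0.126),
  V2 = c(-0.685, 0.030, 0.382, -0.741, 1.015),
  V3 = c(-1.115, -0.013, 2.027, 0.118, -1.018),
  y  = c(-1.951, -0.501, 3.744, -0.046, -1.247)
)

# Every column except the response becomes one base-learner term.
covariates  <- setdiff(names(data), "y")
term_labels <- sprintf("bols(%s, intercept = FALSE)", covariates)

# reformulate() joins the term labels with "+" and attaches the response.
fmla <- reformulate(term_labels, response = "y")
print(fmla)

# With mboost installed, the fit would then be the same as in the answer:
# library(mboost)
# model <- mboost(fmla, data = data)
```

The same code should work unchanged for 10 or more covariates, since the terms are derived from names(data) rather than typed out.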