Linear regression prediction in R using the leave-one-out approach

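I have 3 linear regression models built with mtcars, and I would like to use them to generate predictions for each row of the mtcars table. The predictions should be added as additional columns of the mtcars data frame (3 additional columns) and should be generated in a for loop using the leave-one-out approach. In addition, the predictions for model1 and model2 should be made by "grouping" on the number of cylinders, while the predictions for model3 should be made without any grouping.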

So far I have been able to get something working with a single model in the loop:

```
model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)

fitted_value <- NULL

for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  model1 <- lm(hp ~ mpg, data = training)
  fitted_value[i] <- predict(model1, newdata = validation)
}
```


I would like to generate the predictions for all of the models by first putting the models in a list or vector, and then attach the results to the mtcars data frame. Something like this:

```
model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)

models <- list(model1, model2, model3)

fitted_value <- NULL

for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  fitted_value[i] <- predict(models, newdata = validation)
}
```

Thank you for your help.

You can use nested map calls to fit each of the three formulas for every held-out row i, then use bind_cols on mtcars to attach the predictions.

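```
library(tidyverse)

# hp is the response, so it cannot also be a predictor; drat stands in for it.
frml_1 <- as.formula("hp ~ mpg")
frml_2 <- as.formula("hp ~ mpg + drat")
frml_3 <- as.formula("hp ~ mpg + drat + wt")
frmls <- list(frml_1 = frml_1, frml_2 = frml_2, frml_3 = frml_3)

mtcars %>%
  bind_cols(
    # Outer map: hold out row i. Inner map_dfc: one prediction column per formula.
    map(1:nrow(mtcars), function(i) {
      map_dfc(frmls, function(frml) {
        training <- mtcars[-i, ]
        fit <- lm(frml, data = training)

        validation <- mtcars[i, ]
        predict(fit, newdata = validation)
      })
    }) %>%
      bind_rows()  # stack the one-row results into a single data frame
  )
```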

```
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb    frml_1    frml_2    frml_3
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 138.65796 138.65796 140.61340
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 138.65796 138.65796 139.55056
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 122.76445 122.76445 124.91348
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 135.12607 135.12607 134.36670
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 158.96634 158.96634 158.85438
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 164.26418 164.26418 164.42112
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 197.81716 197.81716 199.74665
...
```

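Note that hp has been removed from the right-hand side of the formulas, because hp is also the response. I used drat in its place for demonstration purposes.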
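The question also asks for the first two models to be predicted within groups of cyl, which the code above does not cover. Below is a minimal base R sketch of one reading of that requirement. It assumes "grouping" means that the training set for a held-out car is restricted to the other cars with the same number of cylinders. The names `grouped` and `res`, and the use of drat in place of hp, are illustrative choices rather than anything from the question.

```r
# Formulas as in the answer above (drat replaces hp on the right-hand side).
frmls <- list(
  frml_1 = hp ~ mpg,
  frml_2 = hp ~ mpg + drat,
  frml_3 = hp ~ mpg + drat + wt
)

# Assumption: TRUE means "train only on cars with the same cyl as the held-out car".
grouped <- c(frml_1 = TRUE, frml_2 = TRUE, frml_3 = FALSE)

res <- mtcars
for (nm in names(frmls)) res[[nm]] <- NA_real_

for (i in seq_len(nrow(mtcars))) {
  others   <- seq_len(nrow(mtcars)) != i        # every row except i
  same_cyl <- mtcars$cyl == mtcars$cyl[i]       # rows in the same cyl group as i

  for (nm in names(frmls)) {
    keep <- if (grouped[[nm]]) others & same_cyl else others
    fit  <- lm(frmls[[nm]], data = mtcars[keep, ])
    res[i, nm] <- predict(fit, newdata = mtcars[i, ])
  }
}

head(res)
```

With this reading, the smallest group (cyl == 6, 7 cars) leaves 6 training rows per fit, which is still enough for the two grouped formulas but would get thin quickly if more predictors were added.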

I was able to get this done by running the following script:

```
fitted_value1 <- NULL
fitted_value2 <- NULL
fitted_value3 <- NULL

for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  model1 = lm(hp ~ mpg, data = training)
  model2 = lm(hp ~ mpg + hp, data = training)
  model3 = lm(hp ~ mpg + hp + wt, data = training)
  fitted_value1[i] <- predict(model1, newdata = validation)
  fitted_value2[i] <- predict(model2, newdata = validation)
  fitted_value3[i] <- predict(model3, newdata = validation)
  res <- as.data.frame(cbind(mtcars, fitted_value1, fitted_value2, fitted_value3))
}
```

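How can I improve this code? I would like to take the models out of the loop, save them in a list, and only reference the list inside the loop. This is more or less what I would ideally like (but it does not work):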

```
model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)
models <- list(model1, model2, model3)

fitted_value <- NULL

for (i in 1:nrow(mtcars)) {
  for (j in models) {
    validation <- mtcars[i, ]
    training <- mtcars[-i, ]
    fitted_value[i] <- predict(models[j], newdata = validation)

    # this should save the predictions for all the models and append it to the original dataframe
    df <- cbind(mtcars, fitted_value)
  }
}
```

Thank you for your help.
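One possible way to get the structure asked for here, offered as a sketch rather than the only fix. Two things stop the attempt above from working: `for (j in models)` iterates over the model objects themselves, so `models[j]` is not a valid index, and a model fitted on all of mtcars has already seen the held-out row, so it has to be re-fitted on `training` inside the loop. The sketch keeps the models in a list defined once and re-fits each with base R's `update()`. The replacement of hp by drat on the right-hand side follows the answer above, and the matrix `fitted_values` is an illustrative name.

```r
# Models are defined once, outside the loop.
# Assumption: drat replaces hp on the right-hand side, since hp is the response.
models <- list(
  fitted_value1 = lm(hp ~ mpg, data = mtcars),
  fitted_value2 = lm(hp ~ mpg + drat, data = mtcars),
  fitted_value3 = lm(hp ~ mpg + drat + wt, data = mtcars)
)

n <- nrow(mtcars)

# One column of predictions per model, filled in row by row.
fitted_values <- matrix(
  NA_real_, nrow = n, ncol = length(models),
  dimnames = list(rownames(mtcars), names(models))
)

for (i in seq_len(n)) {
  training   <- mtcars[-i, ]
  validation <- mtcars[i, ]

  for (j in seq_along(models)) {
    # Re-fit model j on the training rows only, keeping its formula.
    fit <- update(models[[j]], data = training)
    fitted_values[i, j] <- predict(fit, newdata = validation)
  }
}

# Bind once, after the loop, instead of on every iteration.
res <- cbind(mtcars, fitted_values)
head(res)
```

Storing formulas in the list instead of fitted models and calling `lm()` inside the loop would work equally well; `update()` is used here only because the question asks to keep model objects in the list.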