Using a loop to run a regression using different datasets in R?

I have the following dataset:

n <- 2
strata <- rep(1:4, each=n)
y <- rnorm(n = 8)
x <- 1:8

df <- cbind.data.frame(y, x, strata)

I would like to carry out the following process using a loop:

data_1 <- subset(df, strata == 1)
data_2 <- subset(df, strata == 2)
data_3 <- subset(df, strata == 3)
data_4 <- subset(df, strata == 4)

model1 <- lm(y ~ x, data = data_1)
model2 <- lm(y ~ x, data = data_2)
model3 <- lm(y ~ x, data = data_3)
model4 <- lm(y ~ x, data = data_4)

Any help would be appreciated, thanks!

We can split the data by 'strata' into a list and create the models by looping over that list with lapply:

out <- lapply(split(df, df$strata), function(dat) lm(y ~ x, data = dat))

-output

$`1`

Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)            x  
     -2.907        1.924  


$`2`

Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)            x  
     2.5733      -0.7632  


$`3`

Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)            x  
     0.9309      -0.1986  


$`4`

Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)            x  
      8.479       -1.207  

Try this:

library(tidyverse)
library(broom)
mtcars %>% 
  group_nest(gear) %>% 
  mutate(model = map(data, ~lm(disp ~ mpg, data = .x)) %>% map(broom::glance)) %>% 
  unnest(model)
#> # A tibble: 3 x 14
#>    gear        data r.squared adj.r.squared sigma statistic p.value    df logLik
#>   <dbl> <list<tibb>     <dbl>         <dbl> <dbl>     <dbl>   <dbl> <dbl>  <dbl>
#> 1     3   [15 x 10]     0.526         0.489  67.8      14.4 2.23e-3     1  -83.5
#> 2     4   [12 x 10]     0.812         0.793  17.7      43.2 6.28e-5     1  -50.4
#> 3     5    [5 x 10]     0.775         0.701  63.2      10.4 4.86e-2     1  -26.5
#> # ... with 5 more variables: AIC <dbl>, BIC <dbl>, deviance <dbl>,
#> #   df.residual <int>, nobs <int>


mtcars %>% 
  group_nest(gear) %>% 
  mutate(model = map(data, ~lm(disp ~ mpg, data = .x)) %>% map(broom::tidy)) %>% 
  unnest(model)
#> # A tibble: 6 x 7
#>    gear                data term        estimate std.error statistic     p.value
#>   <dbl> <list<tibble[,10]>> <chr>          <dbl>     <dbl>     <dbl>       <dbl>
#> 1     3           [15 x 10] (Intercept)   655.       88.3       7.41 0.00000508 
#> 2     3           [15 x 10] mpg           -20.4       5.37     -3.80 0.00223    
#> 3     4           [12 x 10] (Intercept)   286.       25.3      11.3  0.000000514
#> 4     4           [12 x 10] mpg            -6.64      1.01     -6.57 0.0000628  
#> 5     5            [5 x 10] (Intercept)   529.      105.        5.02 0.0152     
#> 6     5            [5 x 10] mpg           -15.3       4.74     -3.22 0.0486
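
The examples above use mtcars; applied to the data from the question, the same pattern might look roughly like the sketch below. The seed and the name `res` are assumptions added for illustration, and the packages are loaded individually instead of through tidyverse.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

set.seed(1)  # assumption: fixed seed, not part of the original question
df <- data.frame(y = rnorm(n = 8), x = 1:8, strata = rep(1:4, each = 2))

res <- df %>%
  group_nest(strata) %>%                                # one row per stratum, data in a list-column
  mutate(model = map(data, ~ lm(y ~ x, data = .x)),     # fit one model per stratum
         coefs = map(model, broom::tidy)) %>%           # coefficient table per model
  select(strata, coefs) %>%
  unnest(coefs)
res
```

Note that with only two observations per stratum each line fits its points exactly, so the estimates are usable but the standard errors, statistics, and p-values come out as NaN.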

Created on 2021-06-07 by the reprex package (v2.0.0)