How to iterate over columns with vectorization together with the group_by function from dplyr

Question

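As explained in Fitting several regression models with dplyr, we can use the tidy function from the broom package to run a regression within each group. The demo code for the iris dataset is listed below. Suppose, however, that we want to loop over several columns at once and run the regression with different dependent variables (Sepal.Length, Sepal.Width, Petal.Length) together with this group_by operation. How can I integrate an (s)apply function into this setup and get the results of all these regression models (3*3=9)?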

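library(dplyr)
library(broom)

# Fit one regression per species and tidy each fit inside do();
# the older tidy(res1, res) call on a rowwise result fails in recent broom releases
res1 <- iris %>%
  group_by(Species) %>%
  do(tidy(lm(Sepal.Length ~ Petal.Width, data = .)))

# Keep only the slope terms
res1 %>%
  filter(term != "(Intercept)")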

Answer 1

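You can do this with lme4::lmList and broom.mixed::tidy. You may be able to adapt it to a pipe, but this should get you started. Here, lmList plays essentially the same role as group_by in a dplyr pipeline, but I find it easier to conceptualize how to run through multiple DVs (dependent variables) with lapply. Good luck!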

library(lme4)
library(broom.mixed)

# Selecting DVs
dvs <- names(iris)[1:3]

# Making formula objects
formula_text <- paste0(dvs, "~ Petal.Width | Species")
formulas <- lapply(formula_text, formula)

# Running grouped analyses and looping through DVs
results <- lapply(formulas, function(x) {
  res <- broom.mixed::tidy(lmList(x, iris))
  res[res$terms != "(Intercept)",]
})

# Naming each result after its formula
names(results) <- formula_text

And, viewing the results:

results
$`Sepal.Length~ Petal.Width | Species`
# A tibble: 3 x 6
  group      terms       estimate   p.value std.error statistic
  <chr>      <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 setosa     Petal.Width    0.930 0.154         0.649      1.43
2 versicolor Petal.Width    1.43  0.0000629     0.346      4.12
3 virginica  Petal.Width    0.651 0.00993       0.249      2.61

$`Sepal.Width~ Petal.Width | Species`
# A tibble: 3 x 6
  group      terms       estimate    p.value std.error statistic
  <chr>      <chr>          <dbl>      <dbl>     <dbl>     <dbl>
1 setosa     Petal.Width    0.837 0.0415         0.407      2.06
2 versicolor Petal.Width    1.05  0.00000306     0.217      4.86
3 virginica  Petal.Width    0.631 0.0000855      0.156      4.04

$`Petal.Length~ Petal.Width | Species`
# A tibble: 3 x 6
  group      terms       estimate  p.value std.error statistic
  <chr>      <chr>          <dbl>    <dbl>     <dbl>     <dbl>
1 setosa     Petal.Width    0.546 2.67e- 1     0.490      1.12
2 versicolor Petal.Width    1.87  3.84e-11     0.261      7.16
3 virginica  Petal.Width    0.647 7.55e- 4     0.188      3.44
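
For readers who would rather stay inside a dplyr pipeline, as the question asks, here is a minimal sketch and not part of the original answer. It assumes the tidyr package is available, and the names dvs, dv, value, and slopes are illustrative choices. The idea is to stack the three dependent variables into one long column so that group_by(Species, dv) defines all 3*3=9 models at once, with no explicit (s)apply call.

```r
library(dplyr)
library(tidyr)
library(broom)

# Dependent variables to iterate over (the same three columns as in the question)
dvs <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

slopes <- iris %>%
  # Stack the DVs into one column so each (Species, dv) pair becomes a group
  pivot_longer(all_of(dvs), names_to = "dv", values_to = "value") %>%
  group_by(Species, dv) %>%
  # Fit one model per group and return its tidy coefficient table
  group_modify(~ tidy(lm(value ~ Petal.Width, data = .x))) %>%
  ungroup() %>%
  # Keep only the slope terms
  filter(term != "(Intercept)")

slopes
```

The slope estimates should match the lmList output above. The standard errors and p-values may differ, because lmList, as far as I understand its defaults, pools the residual variance across groups, whereas this sketch fits each model separately.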

r

vectorization

dplyr