Applying an lmer function over several groups and several outcomes in an R data frame

I have a data frame that looks something like this:

example <- tribble(
  ~PERSON_ID, ~RANGE, ~outcome1, ~outcome2, ~outcome3, ~type,
  "1", "PRE", 1, 2, 3, "type1",
  "1", "POST", 2, 3, 4, "type1",
  "2", "PRE", 3, 4, 5, "type2",
  "2", "POST", 5, 6, 7, "type2",
  "3", "PRE", 6, 7, 8, "type3",
  "3", "POST", 9, 10, 11, "type3",
  "4", "PRE", 12, 12, 13, "type1",
  "4", "POST", 12, 13, 14, "type1",
  "5", "PRE", 13, 14, 15, "type2",
  "5", "POST", 15, 16, 17, "type2",
  "6", "PRE", 16, 17, 18, "type3",
  "6", "POST", 19, 10, 11, "type3"
)

I want to run the following function on it so that I get a linear model for each "type" and, within each type, a result for each outcome ("outcome1", "outcome2", "outcome3"):

model_function <- function(outcome, df) {
  fit <- lmer(as.formula(paste0(outcome, "~ RANGE + (1|PERSON_ID)")), data = df)
  return(summary(fit))
}

This is what I have tried, but I get the following error: "Error: The result of .f should be a data frame."

outcomes <- purrr::set_names(names(example)[c(3,4,5)])

example %>% 
  dplyr::group_by(type) %>% 
  group_modify(~model_function(outcome = outcomes, df = .x))

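Any help is greatly appreciated. Ideally, the final result would be a table like the following:

       outcome1               outcome2               outcome3
type1  coefficient; p-value   coefficient; p-value   coefficient; p-value
type2  coefficient; p-value   coefficient; p-value   coefficient; p-value
type3  coefficient; p-value   coefficient; p-value   coefficient; p-value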

Assuming you are using library(lmerTest) to get the p-values, here is a potential solution that makes use of library(broom.mixed):

library(tidyverse)
library(lmerTest)
library(broom.mixed)

example <- tribble(
  ~PERSON_ID, ~RANGE, ~outcome1, ~outcome2, ~outcome3, ~type,
  "1", "PRE", 1, 2, 3, "type1",
  "1", "POST", 2, 3, 4, "type1",
  "2", "PRE", 3, 4, 5, "type2",
  "2", "POST", 5, 6, 7, "type2",
  "3", "PRE", 6, 7, 8, "type3",
  "3", "POST", 9, 10, 11, "type3",
  "4", "PRE", 12, 12, 13, "type1",
  "4", "POST", 12, 13, 14, "type1",
  "5", "PRE", 13, 14, 15, "type2",
  "5", "POST", 15, 16, 17, "type2",
  "6", "PRE", 16, 17, 18, "type3",
  "6", "POST", 19, 10, 11, "type3"
)

example |> 
  nest(data = -type) |> 
  mutate(
    m1 = map(data, ~ lmer(outcome1 ~ RANGE + (1|PERSON_ID), data = .x)),
    m2 = map(data, ~ lmer(outcome2 ~ RANGE + (1|PERSON_ID), data = .x)),
    m3 = map(data, ~ lmer(outcome3 ~ RANGE + (1|PERSON_ID), data = .x)),
    # Tidy each fitted model and keep only row 2, the RANGE term's estimate and p-value
    across(m1:m3, ~ map(.x, \(m) m |> tidy() |> select(estimate, p.value) |> slice(2)))
  ) |> 
  select(-data) |> 
  unnest(everything(), names_sep = "_")

#> # A tibble: 3 × 7
#>   type_type m1_estimate m1_p.value m2_estimate m2_p.value m3_estimate m3_p.value
#>   <chr>           <dbl>      <dbl>       <dbl>      <dbl>       <dbl>      <dbl>
#> 1 type1            -0.5      0.500       -1      3.68e-10       -1      8.83e- 1
#> 2 type2            -2        0.888       -2      9.01e- 1       -2      2.70e-10
#> 3 type3            -3        0.887        2.00   7.28e- 1        2.00   7.28e- 1

Created on 2021-08-04 by the reprex package (v2.0.0)
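
A note on reading the estimates: RANGE is a character column, so R turns it into a factor with alphabetically ordered levels when the model is fit. "POST" therefore becomes the reference level, and the reported coefficient is PRE minus POST. If you would rather see POST minus PRE, one option (an assumption about what you want, not something the question states) is to set the level order explicitly before fitting. A minimal base R sketch, using a made-up vector `range_chr`:

```r
# Hypothetical stand-in for the RANGE column
range_chr <- c("PRE", "POST", "PRE", "POST")

# Default conversion sorts levels alphabetically, so POST is the reference
default_levels <- levels(factor(range_chr))

# Setting levels explicitly makes PRE the reference level instead
range_fct <- factor(range_chr, levels = c("PRE", "POST"))
explicit_levels <- levels(range_fct)

default_levels   # "POST" "PRE"
explicit_levels  # "PRE"  "POST"
```

In the pipeline above this would amount to something like `mutate(RANGE = factor(RANGE, levels = c("PRE", "POST")))` before `nest()`, which flips the sign of the estimates and leaves the p-values unchanged.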

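Here is an option that takes the alternative approach of keeping your model_function: inside group_modify() I run an inner map_dfc() loop to go through all of the outcome variables for each group.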

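Note that your function must return a data.frame for group_modify() to work. I modified it to return a data.frame containing only the coefficient and the p-value from the test of the fixed-effect difference. I include example output for one model.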
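
To illustrate the data.frame requirement on its own, here is a minimal sketch with made-up data and a plain lm() in place of lmer(). The names `toy` and `slope_df` are hypothetical and exist only for this example. The point is that the function passed to group_modify() hands back a one-row data frame rather than a summary object:

```r
library(dplyr)

# Made-up data: two groups with two points each
toy <- tibble::tibble(
  g = c("a", "a", "b", "b"),
  x = c(1, 2, 3, 4),
  y = c(2, 4, 5, 9)
)

# Returns a one-row data frame, which is what group_modify() expects;
# returning summary(fit) here would trigger the error from the question
slope_df <- function(df) {
  fit <- lm(y ~ x, data = df)
  data.frame(slope = coef(fit)[["x"]])
}

res <- toy %>%
  group_by(g) %>%
  group_modify(~ slope_df(.x))

res
```

group_modify() binds the per-group data frames together and adds the grouping column back, so `res` has one row per group with columns `g` and `slope`.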

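Note that I use lmerTest for the p-values and broom.mixed for convenient output. However, you could also extract the coefficient and p-value manually from the summary output.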

library(lme4)
library(lmerTest)
library(broom.mixed)
library(purrr)
library(dplyr)
example <- tibble::tribble(
    ~PERSON_ID, ~RANGE, ~outcome1, ~outcome2, ~outcome3, ~type,
    "1", "PRE", 1, 2, 3, "type1",
    "1", "POST", 2, 3, 4, "type1",
    "2", "PRE", 3, 4, 5, "type2",
    "2", "POST", 5, 6, 7, "type2",
    "3", "PRE", 6, 7, 8, "type3",
    "3", "POST", 9, 10, 11, "type3",
    "4", "PRE", 12, 12, 13, "type1",
    "4", "POST", 12, 13, 14, "type1",
    "5", "PRE", 13, 14, 15, "type2",
    "5", "POST", 15, 16, 17, "type2",
    "6", "PRE", 16, 17, 18, "type3",
    "6", "POST", 19, 10, 11, "type3"
)

model_function <- function(outcome, df) {
    fit <- lmer(as.formula(paste0(outcome, "~ RANGE + (1|PERSON_ID)")), data = df)
    out = tidy(fit, effects = "fixed")[2, c("estimate", "p.value")]
    names(out) = paste(outcome, names(out), sep = "_")
    out
}

# Check what output looks like
model_function("outcome1", df = filter(example, type == "type1"))
#> # A tibble: 1 x 2
#>   outcome1_estimate outcome1_p.value
#>               <dbl>            <dbl>
#> 1              -0.5            0.500

# variables to loop through
outcomes <- purrr::set_names(names(example)[c(3,4,5)])

example %>% 
    dplyr::group_by(type) %>% 
    group_modify(~ {
        map_dfc(outcomes, function(outcome) model_function(outcome = outcome, df = .x))
    })
#> # A tibble: 3 x 7
#> # Groups:   type [3]
#>   type  outcome1_estimate outcome1_p.value outcome2_estimate outcome2_p.value
#>   <chr>             <dbl>            <dbl>             <dbl>            <dbl>
#> 1 type1              -0.5            0.500             -1      0.000000000368
#> 2 type2              -2.             0.888             -2      0.901         
#> 3 type3              -3.             0.887              2.00   0.728         
#> # ... with 2 more variables: outcome3_estimate <dbl>, outcome3_p.value <dbl>

Created on 2021-08-05 by the reprex package (v2.0.0)
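
For the manual extraction mentioned above, a rough sketch without broom.mixed could look like the following. It assumes lmerTest is loaded, so that the coefficient table includes a "Pr(>|t|)" column, and it uses the sleepstudy data shipped with lme4 as a stand-in for one group's data. The object name `manual` is just for illustration.

```r
library(lmerTest)  # masks lme4::lmer so that summary() reports p-values

# sleepstudy ships with lme4; it stands in for one group's data here
fit <- lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)

# Fixed-effects table: one row per term, columns include "Estimate" and "Pr(>|t|)"
coefs <- coef(summary(fit))

# Row 2 is the first non-intercept term (Days here, RANGE in the question's model)
manual <- data.frame(
  estimate = coefs[2, "Estimate"],
  p.value  = coefs[2, "Pr(>|t|)"]
)
manual
```

The same two lines that build `manual` could replace the tidy() call inside model_function, as long as the function still renames the columns per outcome and returns the data frame.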