Applying the lmer function over several groups and several outcomes in a data frame in R
I have a data frame that looks a bit like this:
example <- tribble(
~PERSON_ID, ~RANGE, ~outcome1, ~outcome2, ~outcome3, ~type,
"1", "PRE", 1, 2, 3, "type1",
"1", "POST", 2, 3, 4, "type1",
"2", "PRE", 3, 4, 5, "type2",
"2", "POST", 5, 6, 7, "type2",
"3", "PRE", 6, 7, 8, "type3",
"3", "POST", 9, 10, 11, "type3",
"4", "PRE", 12, 12, 13, "type1",
"4", "POST", 12, 13, 14, "type1",
"5", "PRE", 13, 14, 15, "type2",
"5", "POST", 15, 16, 17, "type2",
"6", "PRE", 16, 17, 18, "type3",
"6", "POST", 19, 10, 11, "type3"
)
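I would like to run the following function over it, so that I get a linear model for each "type" and, within each type, a result for each outcome ("outcome1", "outcome2", "outcome3"):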
model_function <- function(outcome, df) {
fit <- lmer(as.formula(paste0(outcome, "~ RANGE + (1|PERSON_ID)")), data = df)
return(summary(fit))
}
This is what I have tried, but I get the following error: "Error: The result of .f should be a data frame."
outcomes <- purrr::set_names(names(example)[c(3,4,5)])
example %>%
dplyr::group_by(type) %>%
group_modify(~model_function(outcome = outcomes, df = .x))
Any help is much appreciated. Ideally, the final result would be a table like the following:
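| | outcome1 | outcome2 | outcome3 |
|---|---|---|---|
| type1 | coefficient; p-value | coefficient; p-value | coefficient; p-value |
| type2 | coefficient; p-value | coefficient; p-value | coefficient; p-value |
| type3 | coefficient; p-value | coefficient; p-value | coefficient; p-value |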
Assuming you are using library(lmerTest) to get the p-values, here is a potential solution that makes use of library(broom.mixed):
library(tidyverse)
library(lmerTest)
library(broom.mixed)
example <- tribble(
~PERSON_ID, ~RANGE, ~outcome1, ~outcome2, ~outcome3, ~type,
"1", "PRE", 1, 2, 3, "type1",
"1", "POST", 2, 3, 4, "type1",
"2", "PRE", 3, 4, 5, "type2",
"2", "POST", 5, 6, 7, "type2",
"3", "PRE", 6, 7, 8, "type3",
"3", "POST", 9, 10, 11, "type3",
"4", "PRE", 12, 12, 13, "type1",
"4", "POST", 12, 13, 14, "type1",
"5", "PRE", 13, 14, 15, "type2",
"5", "POST", 15, 16, 17, "type2",
"6", "PRE", 16, 17, 18, "type3",
"6", "POST", 19, 10, 11, "type3"
)
example |>
nest(data = -type) |>
mutate(
m1 = map(data, ~ lmer(outcome1 ~ RANGE + (1|PERSON_ID), data = .x)),
m2 = map(data, ~ lmer(outcome2 ~ RANGE + (1|PERSON_ID), data = .x)),
m3 = map(data, ~ lmer(outcome3 ~ RANGE + (1|PERSON_ID), data = .x)),
across(m1:m3, \(col) map(col, \(m) m |> tidy() |> select(estimate, p.value) |> slice(2)))
) |>
select(-data) |>
unnest(everything(), names_sep = "_")
#> # A tibble: 3 × 7
#> type_type m1_estimate m1_p.value m2_estimate m2_p.value m3_estimate m3_p.value
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 type1 -0.5 0.500 -1 3.68e-10 -1 8.83e- 1
#> 2 type2 -2 0.888 -2 9.01e- 1 -2 2.70e-10
#> 3 type3 -3 0.887 2.00 7.28e- 1 2.00 7.28e- 1
Created on 2021-08-04 by the reprex package (v2.0.0)
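The m1, m2 and m3 calls above are written out by hand, once per outcome. As a rough sketch of the same idea for any number of outcome columns, the data could first be reshaped to long format so that each type/outcome pair becomes one nested data set. The names `long_fit`, `value` and `res` are made up for this illustration, and the sketch assumes every outcome column starts with "outcome" and that the coefficient of interest is the `RANGEPRE` term.

```r
library(tidyverse)
library(lmerTest)
library(broom.mixed)

example <- tribble(
  ~PERSON_ID, ~RANGE, ~outcome1, ~outcome2, ~outcome3, ~type,
  "1", "PRE", 1, 2, 3, "type1",
  "1", "POST", 2, 3, 4, "type1",
  "2", "PRE", 3, 4, 5, "type2",
  "2", "POST", 5, 6, 7, "type2",
  "3", "PRE", 6, 7, 8, "type3",
  "3", "POST", 9, 10, 11, "type3",
  "4", "PRE", 12, 12, 13, "type1",
  "4", "POST", 12, 13, 14, "type1",
  "5", "PRE", 13, 14, 15, "type2",
  "5", "POST", 15, 16, 17, "type2",
  "6", "PRE", 16, 17, 18, "type3",
  "6", "POST", 19, 10, 11, "type3"
)

long_fit <- example |>
  # One row per person, time point and outcome (assumed column prefix "outcome")
  pivot_longer(starts_with("outcome"), names_to = "outcome", values_to = "value") |>
  # One nested data set per type/outcome combination
  nest(data = -c(type, outcome)) |>
  mutate(
    res = map(data, \(d) {
      lmer(value ~ RANGE + (1 | PERSON_ID), data = d) |>
        tidy(effects = "fixed") |>
        filter(term == "RANGEPRE") |>   # keep only the PRE vs. POST contrast
        select(estimate, p.value)
    })
  ) |>
  select(-data) |>
  unnest(res) |>
  # Back to one row per type, one estimate/p.value pair per outcome
  pivot_wider(names_from = outcome, values_from = c(estimate, p.value))

long_fit
```

With this layout, adding a fourth outcome column needs no change to the modelling code, at the cost of an extra reshape step.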
Here is an alternative that keeps your model_function approach. Inside group_modify() I run an inner map_dfc() loop that goes through all the outcome variables for each group.
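Note that your function must return a data.frame for group_modify() to work. I modified it to return a data.frame containing only the coefficient and p-value for the fixed effect of RANGE (the PRE vs. POST difference). I include example output for one model.
Note that I am using lmerTest for the p-values and broom.mixed for convenient output. However, you could also extract the coefficients and p-values manually from the summary output.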
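To illustrate the data-frame requirement noted above, here is a minimal sketch on a made-up toy tibble (the names `toy`, `g`, `x`, `ok` and `bad` are illustrative and not from the question). The first call returns a tibble per group and works. The second returns a bare number per group, which is the same kind of mistake as returning summary(fit), and is caught with tryCatch().

```r
library(dplyr)

# Hypothetical toy data: two groups, two values each
toy <- tibble(g = c("a", "a", "b", "b"), x = c(1, 3, 10, 20))

# Works: the function returns a one-row tibble for each group
ok <- toy %>%
  group_by(g) %>%
  group_modify(~ tibble(mean_x = mean(.x$x)))

# Fails: the function returns a plain number, not a data frame
bad <- tryCatch(
  toy %>% group_by(g) %>% group_modify(~ mean(.x$x)),
  error = function(e) "error"
)

ok
bad
```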
library(lme4)
library(lmerTest)
library(broom.mixed)
library(purrr)
library(dplyr)
example <- tibble::tribble(
~PERSON_ID, ~RANGE, ~outcome1, ~outcome2, ~outcome3, ~type,
"1", "PRE", 1, 2, 3, "type1",
"1", "POST", 2, 3, 4, "type1",
"2", "PRE", 3, 4, 5, "type2",
"2", "POST", 5, 6, 7, "type2",
"3", "PRE", 6, 7, 8, "type3",
"3", "POST", 9, 10, 11, "type3",
"4", "PRE", 12, 12, 13, "type1",
"4", "POST", 12, 13, 14, "type1",
"5", "PRE", 13, 14, 15, "type2",
"5", "POST", 15, 16, 17, "type2",
"6", "PRE", 16, 17, 18, "type3",
"6", "POST", 19, 10, 11, "type3"
)
model_function <- function(outcome, df) {
fit <- lmer(as.formula(paste0(outcome, "~ RANGE + (1|PERSON_ID)")), data = df)
out = tidy(fit, effects = "fixed")[2, c("estimate", "p.value")]
names(out) = paste(outcome, names(out), sep = "_")
out
}
# Check what output looks like
model_function("outcome1", df = filter(example, type == "type1"))
#> # A tibble: 1 x 2
#> outcome1_estimate outcome1_p.value
#> <dbl> <dbl>
#> 1 -0.5 0.500
# variables to loop through
outcomes <- purrr::set_names(names(example)[c(3,4,5)])
example %>%
dplyr::group_by(type) %>%
group_modify(~ {
map_dfc(outcomes, function(outcome) model_function(outcome = outcome, df = .x))
})
#> # A tibble: 3 x 7
#> # Groups: type [3]
#> type outcome1_estimate outcome1_p.value outcome2_estimate outcome2_p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 type1 -0.5 0.500 -1 0.000000000368
#> 2 type2 -2. 0.888 -2 0.901
#> 3 type3 -3. 0.887 2.00 0.728
#> # ... with 2 more variables: outcome3_estimate <dbl>, outcome3_p.value <dbl>
Created on 2021-08-05 by the reprex package (v2.0.0)
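The manual extraction mentioned in the note above could look roughly like the following sketch, which does not use broom.mixed. It assumes lmerTest is loaded, so that the coefficient table from summary() carries a "Pr(>|t|)" column. The `type1` data frame is just the type1 rows of the example, retyped here so the block runs on its own.

```r
library(lmerTest)  # provides an lmer() whose summary() includes p-values

# The type1 rows of the example data, outcome1 only
type1 <- data.frame(
  PERSON_ID = c("1", "1", "4", "4"),
  RANGE     = c("PRE", "POST", "PRE", "POST"),
  outcome1  = c(1, 2, 12, 12)
)

fit <- lmer(outcome1 ~ RANGE + (1 | PERSON_ID), data = type1)

# Fixed-effects table as a matrix, one row per term
coefs <- coef(summary(fit))

# "RANGEPRE" is the PRE vs. POST contrast (POST is the reference level)
coefs["RANGEPRE", c("Estimate", "Pr(>|t|)")]
```

Wrapping the last line in as.data.frame(t(...)) would give a one-row data frame that could be returned from model_function in place of the tidy() call.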