Add ifelse() created variable to list of data frames

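I would like to know how to use the ifelse() command to create a variable in each data frame of a list.

I created 2 datasets from the ggplot2::diamonds dataset, holding the rows with the 300 highest and the 300 lowest values of table, called diamonds_top300 and diamonds_bottom300: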

# Loads packages
# ---- NOTE: making plots and diamonds dataset
if(!require(ggplot2)){install.packages("ggplot2"); library(ggplot2)}
# ---- NOTE: for data wrangling
if(!require(dplyr)){install.packages("dplyr"); library(dplyr)}

# dataset creation
# ---- NOTE: keeps the rows with the 300 highest values of table (ties can return more than 300 rows)
diamonds_top300 <- data.frame(dplyr::top_n(diamonds, 300, table))
# ---- NOTE: keeps the rows with the 300 lowest values of table (ties can return more than 300 rows)
diamonds_bottom300 <- data.frame(dplyr::top_n(diamonds, -300, table))

Then I used lapply and a function to create a list of 2 models that differ only in the dataset used:

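# Loads packages
# ---- NOTE: run mixed effects models
if(!require(lme4)){install.packages("lme4")}
# ---- NOTE: wrapr::let() substitutes the dataset name into the model call
if(!require(wrapr)){install.packages("wrapr")}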

## lists datasets to use
DATASET_list <- c("diamonds_top300", "diamonds_bottom300")

## creates model
# ---- NOTE: creates list object
freq_mlm_poisson_model <- 
  lapply(DATASET_list,
         function(data_list) wrapr::let(
           c(data_list_model = data_list), 
           (lme4::glmer(
             price ~ cut + color + carat + (1 | clarity) + (1 | depth),
             data = data_list_model,
             family = poisson()
           )
           )
         )
  )
# ---- NOTE: names the list elements after the datasets
freq_mlm_poisson_model <- 
  setNames(freq_mlm_poisson_model, paste("freq_mlm_poisson_model", 
                                         DATASET_list,
                                         sep = "__")
  )
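
As a side note, the same "one model per dataset" pattern can be sketched in base R without name substitution, by collecting the data frames into a named list first. The sketch below is only an illustration under stated assumptions: toy_top, toy_bottom and toy_models are hypothetical names, the data are simulated, and stats::glm stands in for lme4::glmer so that it runs without extra packages.

```r
# Simulated stand-ins for the two datasets (hypothetical data, not diamonds)
set.seed(1)
toy_top    <- data.frame(x = 1:20, y = rpois(20, lambda = 5))
toy_bottom <- data.frame(x = 1:20, y = rpois(20, lambda = 2))

# Names of the datasets, as in DATASET_list
toy_names <- c("toy_top", "toy_bottom")

# mget() looks the objects up by name and returns them as a named list
toy_data <- mget(toy_names, envir = environment())

# Fit one Poisson model per data frame (glm used here instead of glmer)
toy_models <- lapply(toy_data, function(d) {
  glm(y ~ x, data = d, family = poisson())
})

# Name the list elements after the datasets
names(toy_models) <- paste("toy_model", toy_names, sep = "__")
```

Because the data frames are passed directly to the modelling function, no wrapper such as wrapr::let is needed in this variant.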

Then I used lapply and a function to create summaries of these models, again as a list:

### creates summary model for list object freq_mlm_poisson_model
# ---- NOTE: creates list object
freq_mlm_poisson_summary <- 
  lapply(
    freq_mlm_poisson_model, 
    function(model_list) {
      summary(model_list)
    }
  )
# ---- NOTE: names the list elements after the datasets
freq_mlm_poisson_summary <- 
  setNames(freq_mlm_poisson_summary, paste("freq_mlm_poisson_summary", 
                                           DATASET_list,
                                           sep = "__")
  )

Then I converted the summaries into data frames that contain only the fixed effects information:

### turns summary list fixed effects into list of data frames
# ---- NOTE: creates object with summary of fixed effects
freq_mlm_poisson_summary_fixedeffects <- 
  lapply(freq_mlm_poisson_summary, `[[`, "coefficients")
# ---- NOTE: creates list object
freq_mlm_poisson_summary_fixedeffects_df <- 
  lapply(
    freq_mlm_poisson_summary_fixedeffects, 
    function(model_list) {
      data.frame(model_list)
    }
  )
# ---- NOTE: names the list elements after the datasets
freq_mlm_poisson_summary_fixedeffects_df <- 
  setNames(freq_mlm_poisson_summary_fixedeffects_df,
           paste("freq_mlm_poisson_summary_fixedeffects_df", 
                 DATASET_list,
                 sep = "__")
  )

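Is there a way to use the ifelse() command, or another command, to create a variable named p_value_sign in each data frame of the list that tells, as yes/no, whether the corresponding p-value in Pr(>|z|) is less than 0.05 (i.e. p < 0.05)?

Because it is a list of data.frames, we can loop over the list with lapply and create the column directly with transform. Note that data.frame() has turned the column name Pr(>|z|) into the syntactic name Pr...z.., which is why the code refers to it that way. It is better to keep the new column as a logical vector, because that makes subsetting easier.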

freq_mlm_poisson_summary_fixedeffects_df2 <- lapply(
       freq_mlm_poisson_summary_fixedeffects_df,
       transform, p_value_sign = `Pr...z..` < 0.05)
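
If a literal yes/no column is preferred, as asked in the question, ifelse() can be used inside the same lapply loop. The following is a minimal base R sketch on made-up coefficient tables; toy_top, toy_bottom, toy_df_list and all the numbers are hypothetical and only mimic the column layout of the real fixed-effects tables.

```r
# Made-up coefficient tables standing in for the real fixed-effects summaries
# (all numbers are hypothetical; only the column layout mirrors the real output)
coef_cols <- c("Estimate", "Std. Error", "z value", "Pr(>|z|)")
toy_top <- matrix(c(7.10, 0.20, 35.50, 0.001,
                    0.30, 0.25,  1.20, 0.230),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("(Intercept)", "carat"), coef_cols))
toy_bottom <- matrix(c(6.80, 0.30, 22.70, 0.001,
                       0.90, 0.40,  2.25, 0.024),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("(Intercept)", "carat"), coef_cols))

# data.frame() makes the column names syntactic, so "Pr(>|z|)" becomes "Pr...z.."
toy_df_list <- lapply(list(top = toy_top, bottom = toy_bottom), data.frame)

# ifelse() variant: character "yes"/"no" column in every data frame of the list
toy_df_list2 <- lapply(toy_df_list, function(df) {
  df$p_value_sign <- ifelse(df[["Pr...z.."]] < 0.05, "yes", "no")
  df
})

# Keep only the rows flagged "yes" in each data frame
toy_significant <- lapply(toy_df_list2, function(df) df[df$p_value_sign == "yes", ])
```

With the logical version from transform, the last step would simply be df[df$p_value_sign, ], which is the subsetting convenience mentioned above. To keep the original column name Pr(>|z|), data.frame() can be called with check.names = FALSE, at the cost of having to quote the name with backticks.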

If we need multiple comparisons (more than two categories), use case_when:

library(dplyr)
library(purrr)
freq_mlm_poisson_summary_fixedeffects_df2 <- map(
  freq_mlm_poisson_summary_fixedeffects_df,
  ~ .x %>%
    # case_when() checks the conditions in order, so the second one
    # only applies to p-values that are >= 0.05
    mutate(p_value_sign = case_when(
      `Pr...z..` < 0.05 ~ "significant at p < 0.05",
      `Pr...z..` < 0.10 ~ "marginally significant at p < 0.10",
      TRUE ~ "Not significant"
    ))
)
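
If dplyr is not available, the same three-way labelling can be approximated in base R with cut(). This is a different technique from case_when, shown here only as a sketch on made-up p-values (p_values is a hypothetical vector). It returns a factor, and unlike the TRUE fallback in case_when it leaves NA p-values as NA.

```r
# Made-up p-values for illustration
p_values <- c(0.001, 0.05, 0.07, 0.10, 0.50)

# right = FALSE gives the intervals [-Inf, 0.05), [0.05, 0.10), [0.10, Inf)
p_value_sign <- cut(
  p_values,
  breaks = c(-Inf, 0.05, 0.10, Inf),
  labels = c("significant at p < 0.05",
             "marginally significant at p < 0.10",
             "Not significant"),
  right = FALSE
)
```

Inside the lapply loop this could be used as transform(df, p_value_sign = cut(...)) on the Pr...z.. column, in the same way as the logical column above.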