Passing column names to a function written in R and imported into a separate R Markdown/R file

OK, here is the issue: I wrote a function in R that takes a data frame, manipulates it with dplyr and tidyr, and returns another data frame.

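Problem: I am passing the column name wrapped in single quotes, double quotes, or backticks (``), but R throws an error: "Error: object 'object name' not found" in the backtick case, or "Error: Must group by variables found in `.data`. * Column `column_variable` is not found."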

First chunk: the function, which I import via source()

cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.  
  # Inputs: 
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  data %>% 
    select(DeptSpecialty, category) %>% 
    filter(DeptSpecialty == group) %>% 
    group_by(DeptSpecialty) %>% 
    mutate(Total = n(), Instance = 1) %>% 
    group_by(category) %>% 
    summarise(Perc = sum(Instance) / Total) %>% 
    distinct()
}

Second chunk: the call in my main file, with the column name in quotes

cat_perc_by_dept(data = visit_data, group = "Bariatrics", category = "ChargeDiagnosisCode")

And with backticks:

cat_perc_by_dept(data = visit_data, group = "Bariatrics", category = `ChargeDiagnosisCode`)

How can I pass the column name to the function in a way that does not raise the errors above?

If the input is a string, we can use all_of() in select and wrap it in across() in group_by (across() only works inside data-masking verbs such as group_by, mutate, and summarise, so select takes all_of() directly):

cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.
  # Inputs:
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  data %>%
    select(DeptSpecialty, all_of(category)) %>%
    filter(DeptSpecialty == group) %>% 
    group_by(DeptSpecialty) %>% 
    mutate(Total = n(), Instance = 1) %>% 
    group_by(across(all_of(category))) %>% 
    summarise(Perc = sum(Instance) / Total) %>% 
    distinct()
}
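
To try the string-based approach without the real data, here is a minimal sketch on a made-up data frame. The toy `visit_data` values and the helper name `cat_perc_str` are assumptions for illustration only. The sketch also swaps the mutate/summarise/distinct steps for a simpler count-then-divide, which should give the same percentages for a single department.

```r
library(dplyr)

# Hypothetical toy data standing in for the real visit_data
visit_data <- tibble(
  DeptSpecialty       = c("Bariatrics", "Bariatrics", "Bariatrics", "Cardiology"),
  ChargeDiagnosisCode = c("A", "A", "B", "C")
)

# Hypothetical helper: `category` is expected to be a string
cat_perc_str <- function(data, group, category) {
  data %>%
    filter(DeptSpecialty == group) %>%
    # across(all_of(...)) lets group_by() accept a column name held in a string
    group_by(across(all_of(category))) %>%
    summarise(n = n(), .groups = "drop") %>%
    # share of each category among the rows of the chosen department
    mutate(Perc = n / sum(n)) %>%
    select(-n)
}

res <- cat_perc_str(visit_data, "Bariatrics", "ChargeDiagnosisCode")
print(res)  # two rows: A with 2/3 and B with 1/3
```

Because the column name arrives as an ordinary string, the same call also works when the name is stored in a variable or looped over.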

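Another option is to convert the argument to a symbol with ensym() and unquote it with !!. This is more flexible, because the argument can then be passed either unquoted or quoted.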

cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.  
  # Inputs: 
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  category <- rlang::ensym(category)
  data %>% 
    select(DeptSpecialty, !!category) %>% 
    filter(DeptSpecialty == group) %>% 
    group_by(DeptSpecialty) %>% 
    mutate(Total = n(), Instance = 1) %>% 
    group_by(!!category) %>% 
    summarise(Perc = sum(Instance) / Total) %>% 
    distinct()
}
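
As a quick check that the ensym() approach accepts both calling styles, here is a minimal sketch on made-up data. The toy `visit_data` and the helper name `cat_perc_sym` are assumptions for illustration, and the body is simplified to a count-then-divide rather than the exact pipeline above.

```r
library(dplyr)

# Hypothetical toy data standing in for the real visit_data
visit_data <- tibble(
  DeptSpecialty       = c("Bariatrics", "Bariatrics", "Bariatrics", "Cardiology"),
  ChargeDiagnosisCode = c("A", "A", "B", "C")
)

# Hypothetical helper: `category` may be a string or a bare column name
cat_perc_sym <- function(data, group, category) {
  # ensym() captures the argument as a symbol, whether it was quoted or not
  category <- rlang::ensym(category)
  data %>%
    filter(DeptSpecialty == group) %>%
    # !! unquotes the symbol so group_by() sees the actual column
    group_by(!!category) %>%
    summarise(n = n(), .groups = "drop") %>%
    mutate(Perc = n / sum(n)) %>%
    select(-n)
}

quoted   <- cat_perc_sym(visit_data, "Bariatrics", "ChargeDiagnosisCode")
unquoted <- cat_perc_sym(visit_data, "Bariatrics", ChargeDiagnosisCode)
backtick <- cat_perc_sym(visit_data, "Bariatrics", `ChargeDiagnosisCode`)
```

All three calls should return the same table. A backticked name is just another way of writing the bare symbol, which is why the original function failed on it: R tried to evaluate an object with that name in the calling environment.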