Passing column names to a function, written in R, imported into a separate R Markdown/R File

Question

OK, here is the problem: I wrote a function in R that takes a data frame, then uses dplyr and tidyr to manipulate it and return another data frame.

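Problem: I am passing column names wrapped in single quotes, double quotes, or backticks (``), but R throws an error. With backticks the message is "Error: object 'object name' not found"; with quotes it is "Error: Must group by variables found in `.data`. * Column `column_variable` is not found."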

First chunk: the function that I import via source()

cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.  
  # Inputs: 
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  data %>% 
    select(DeptSpecialty, category) %>% 
    filter(DeptSpecialty == group) %>% 
    group_by(DeptSpecialty) %>% 
    mutate(Total = n(), Instance = 1) %>% 
    group_by(category) %>% 
    summarise(Perc = sum(Instance) / Total) %>% 
    distinct()
}

Second chunk: how I call the function in my main file, with quotes

cat_perc_by_dept(data = visit_data, group = "Bariatrics", category = "ChargeDiagnosisCode")

And with backticks:

cat_perc_by_dept(data = visit_data, group = "Bariatrics", category = `ChargeDiagnosisCode`)

How can I pass the column name to the function without triggering the errors above?

Answer 1

If the input is a string, we can wrap it in all_of() inside select and in across(all_of()) inside group_by (across() itself is not usable inside select, which already takes tidyselect helpers directly)

cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.  
  # Inputs: 
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  data %>% 
    select(DeptSpecialty, all_of(category)) %>% 
    filter(DeptSpecialty == group) %>% 
    group_by(DeptSpecialty) %>% 
    mutate(Total = n(), Instance = 1) %>% 
    group_by(across(all_of(category))) %>% 
    summarise(Perc = sum(Instance) / Total) %>% 
    distinct()
}
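
As a rough illustration of the string-based approach, here is a compact variant run on a small made-up data frame. The toy `visit_data` values and the helper name `cat_perc_str` are assumptions for demonstration only, and the share is computed with n() and sum() instead of the Total/Instance columns used above.

```r
library(dplyr)

# Hypothetical toy data standing in for the real visit_data
visit_data <- tibble(
  DeptSpecialty       = c("Bariatrics", "Bariatrics", "Bariatrics", "Cardiology"),
  ChargeDiagnosisCode = c("E66", "E66", "K21", "I10")
)

# Hypothetical helper: `category` is a string holding a column name
cat_perc_str <- function(data, group, category) {
  data %>%
    filter(DeptSpecialty == group) %>%          # keep one department
    group_by(across(all_of(category))) %>%      # group by the column named in the string
    summarise(n = n(), .groups = "drop") %>%    # count rows per category
    mutate(Perc = n / sum(n))                   # share of each category within the department
}

res_str <- cat_perc_str(visit_data, "Bariatrics", "ChargeDiagnosisCode")
print(res_str)
```

all_of() makes the intent explicit: the string is treated as a column name to look up, and a missing column produces an error instead of silently matching something else.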

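Another option is to convert the argument to a symbol with ensym and unquote it with !!. This is more flexible, because the column can then be passed either unquoted or quoted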

cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.  
  # Inputs: 
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  category <- rlang::ensym(category)
  data %>% 
    select(DeptSpecialty, !!category) %>% 
    filter(DeptSpecialty == group) %>% 
    group_by(DeptSpecialty) %>% 
    mutate(Total = n(), Instance = 1) %>% 
    group_by(!!category) %>% 
    summarise(Perc = sum(Instance) / Total) %>% 
    distinct()
}
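
To show that the ensym version accepts both calling styles, the sketch below calls a compact variant with a quoted and an unquoted column name and compares the results. The toy `visit_data` values and the helper name `cat_perc_sym` are assumptions made up for this example.

```r
library(dplyr)

# Hypothetical toy data standing in for the real visit_data
visit_data <- tibble(
  DeptSpecialty       = c("Bariatrics", "Bariatrics", "Bariatrics", "Cardiology"),
  ChargeDiagnosisCode = c("E66", "E66", "K21", "I10")
)

# Hypothetical helper: `category` may be a bare column name or a string
cat_perc_sym <- function(data, group, category) {
  category <- rlang::ensym(category)            # capture the argument as a symbol
  data %>%
    filter(DeptSpecialty == group) %>%          # keep one department
    group_by(!!category) %>%                    # unquote the symbol into group_by
    summarise(n = n(), .groups = "drop") %>%    # count rows per category
    mutate(Perc = n / sum(n))                   # share of each category within the department
}

quoted   <- cat_perc_sym(visit_data, "Bariatrics", "ChargeDiagnosisCode")
unquoted <- cat_perc_sym(visit_data, "Bariatrics", ChargeDiagnosisCode)

# Both calling styles should give the same table
print(all.equal(quoted, unquoted))
```

In R, backticks only quote a name syntactically, so passing `ChargeDiagnosisCode` in backticks is the same as passing the bare name. That is why the original function reported "object not found" for that call, while the ensym version handles it like any other unquoted column.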

