Passing column names to a function, written in R, imported into a separate R Markdown/R File
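OK, here's the situation: I wrote a function in R that takes a data frame, manipulates it with dplyr and tidyr, and returns another data frame.
Problem: I'm passing the column name wrapped in single quotes, double quotes, or backticks (``), but R throws an error. With backticks I get "Error: object 'object name' not found"; with quotes I get "Error: Must group by variables found in `.data`. * Column `column_variable` is not found."
First chunk: the function, which I import via source()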
cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.
  # Inputs:
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  data %>%
    select(DeptSpecialty, category) %>%
    filter(DeptSpecialty == group) %>%
    group_by(DeptSpecialty) %>%
    mutate(Total = n(), Instance = 1) %>%
    group_by(category) %>%
    summarise(Perc = sum(Instance) / Total) %>%
    distinct()
}
Second chunk: the call in my main file, with quotes
cat_perc_by_dept(data = visit_data, group = "Bariatrics", category = "ChargeDiagnosisCode")
And with backticks:
cat_perc_by_dept(data = visit_data, group = "Bariatrics", category = `ChargeDiagnosisCode`)
How can I pass the column name to the function without triggering the errors above?
If the input is a string, we can wrap it in all_of() inside select and in across(all_of()) inside group_by. Note that across() only works in data-masking verbs such as group_by and mutate; select uses tidyselect, so all_of() goes there directly.
cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.
  # Inputs:
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category column, passed as a string
  data %>%
    # select() is a tidyselect verb, so all_of() is used without across()
    select(DeptSpecialty, all_of(category)) %>%
    filter(DeptSpecialty == group) %>%
    group_by(DeptSpecialty) %>%
    mutate(Total = n(), Instance = 1) %>%
    # group_by() is a data-masking verb, so the selection is wrapped in across()
    group_by(across(all_of(category))) %>%
    summarise(Perc = sum(Instance) / Total) %>%
    distinct()
}
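To make the string-based approach easy to try, here is a minimal, self-contained sketch. The toy `visit_data` values and the helper name `share_by_dept_chr` are hypothetical (only the column names come from the question), and the helper uses `count()` as a compact stand-in for the `mutate`/`summarise`/`distinct` chain rather than reproducing it exactly.

```r
library(dplyr)

# Hypothetical toy data; only the column names are taken from the question.
visit_data <- tibble(
  DeptSpecialty = c("Bariatrics", "Bariatrics", "Bariatrics", "Cardiology"),
  ChargeDiagnosisCode = c("E66", "E66", "K21", "I10")
)

# Hypothetical helper: `category` is a character string naming a column.
share_by_dept_chr <- function(data, group, category) {
  data %>%
    filter(DeptSpecialty == group) %>%
    # across(all_of()) turns the string into a column selection
    count(across(all_of(category)), name = "n") %>%
    # share of each category within the chosen department
    mutate(Perc = n / sum(n))
}

res_chr <- share_by_dept_chr(visit_data, "Bariatrics", "ChargeDiagnosisCode")
print(res_chr)
```

The result has one row per category in the chosen department, and the `Perc` column sums to 1.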
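Another option is to use ensym to convert the argument to a symbol and then evaluate it with !!. This is more flexible, because the argument can be passed either unquoted or quoted.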
cat_perc_by_dept <- function(data, group, category) {
  # Returns the percentage of each category for a specific group.
  # Inputs:
  #   data: the dataset with multiple groups and multiple categories for each group
  #   group: the specific group, eg department, that we want to focus on within group_col
  #   category: the category we want to focus on within a particular group
  category <- rlang::ensym(category)
  data %>%
    select(DeptSpecialty, !!category) %>%
    filter(DeptSpecialty == group) %>%
    group_by(DeptSpecialty) %>%
    mutate(Total = n(), Instance = 1) %>%
    group_by(!!category) %>%
    summarise(Perc = sum(Instance) / Total) %>%
    distinct()
}
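The claim that ensym accepts both forms can be checked with a small sketch. As before, the toy `visit_data` values and the helper name `share_by_dept_sym` are hypothetical, and `count()` is used here as a shorter stand-in for the original pipeline.

```r
library(dplyr)

# Hypothetical toy data; only the column names are taken from the question.
visit_data <- tibble(
  DeptSpecialty = c("Bariatrics", "Bariatrics", "Bariatrics", "Cardiology"),
  ChargeDiagnosisCode = c("E66", "E66", "K21", "I10")
)

# Hypothetical helper: `category` may be a bare column name or a string.
share_by_dept_sym <- function(data, group, category) {
  # ensym() captures the argument and converts it to a symbol
  category <- rlang::ensym(category)
  data %>%
    filter(DeptSpecialty == group) %>%
    # !! unquotes the symbol so count() sees the column itself
    count(!!category, name = "n") %>%
    mutate(Perc = n / sum(n))
}

# Both calling styles should give the same result.
res_bare   <- share_by_dept_sym(visit_data, "Bariatrics", ChargeDiagnosisCode)
res_quoted <- share_by_dept_sym(visit_data, "Bariatrics", "ChargeDiagnosisCode")
print(all.equal(res_bare, res_quoted))
```

Backticks work here too, since a backticked name is just a bare symbol once ensym captures it instead of letting R evaluate it.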