Conditionally apply a pipeline step depending on an external value
Given the following dplyr workflow:
require(dplyr)
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(grepl(x = model, pattern = "Merc")) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
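I would like to apply the filter step conditionally, depending on the value of applyFilter.
Solution
With applyFilter <- 1, rows are filtered using the "Merc" string; without the filter, all rows are returned.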
applyFilter <- 1
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(model %in%
if (applyFilter) {
rownames(mtcars)[grepl(x = rownames(mtcars), pattern = "Merc")]
} else
{
rownames(mtcars)
}) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
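Problem
The suggested solution is inefficient because the if/else expression is always evaluated; a more desirable approach would evaluate the filter step only when applyFilter <- 1.
Attempt
In place of the inefficient working solution, the pipeline would ideally look like this: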
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
# Only apply filter step if condition is met
if (applyFilter) {
filter(grepl(x = model, pattern = "Merc"))
}
%>%
# Continue
group_by(am) %>%
summarise(meanMPG = mean(mpg))
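Of course, the syntax above is incorrect. It only illustrates what the ideal workflow should look like.
Desired answer
I am not interested in creating temporary objects; the workflow should resemble:
startingObject %>% ... conditional filter ... final object
Ideally, I would like a solution that lets me control whether the filter call is evaluated at all.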
How about this approach:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(if (applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
This means grepl is evaluated only when applyFilter is 1; otherwise filter simply recycles the single TRUE across all rows, so nothing is dropped.
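To check this behaviour for both values of the flag, the pipeline can be wrapped in a small function. This is only a sketch: it assumes dplyr and tibble are installed, and mean_mpg_by_am is a hypothetical name that does not appear in the question.

```r
library(dplyr)

# Hypothetical wrapper; the function name is illustrative only.
mean_mpg_by_am <- function(applyFilter) {
  mtcars %>%
    tibble::rownames_to_column(var = "model") %>%
    # When applyFilter is not 1, the condition is the single value TRUE,
    # which filter() recycles to every row, so no row is dropped.
    filter(if (applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>%
    group_by(am) %>%
    summarise(meanMPG = mean(mpg))
}

filtered   <- mean_mpg_by_am(applyFilter = 1)  # "Merc" models only
unfiltered <- mean_mpg_by_am(applyFilter = 0)  # all models
```

Comparing filtered with unfiltered shows whether the flag had any effect, without changing the pipeline itself.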
Another option is to use {}:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
{if (applyFilter == 1) filter(., grepl(x = model, pattern = "Merc")) else .} %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
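If the same pattern is needed in several pipelines, the {} block could be factored out into a helper. The sketch below assumes dplyr and tibble are installed; filter_when is a hypothetical function, not part of dplyr. Because R evaluates arguments lazily, the filter expressions passed through ... are never evaluated when the condition is not TRUE.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): apply filter() only when
# `condition` is TRUE, otherwise return the data unchanged.
filter_when <- function(.data, condition, ...) {
  if (isTRUE(condition)) filter(.data, ...) else .data
}

applyFilter <- 1

result <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  filter_when(applyFilter == 1, grepl(x = model, pattern = "Merc")) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
```

The pipeline stays unbroken and no temporary object is created, which matches what the question asks for.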
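There is, of course, another possible approach: simply break the pipe, apply the filter conditionally, and then continue the pipe (I know the OP did not ask for this; it is just another example for other readers):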
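library(magrittr) # provides the %<>% assignment pipe, which dplyr does not re-export
# Note: %<>% overwrites mtcars in the current workspace
mtcars %<>%
tibble::rownames_to_column(var = "model")
if (applyFilter == 1) mtcars %<>% filter(grepl(x = model, pattern = "Merc"))
mtcars %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))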
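The snippet above reassigns mtcars in the workspace. A variant that leaves the built-in dataset untouched might look like the sketch below; it assumes dplyr and tibble are installed, and cars_with_model is a hypothetical object name. It does create a temporary object, so it is only relevant for readers who do not share the OP's constraint.

```r
library(dplyr)

applyFilter <- 1

# Hypothetical intermediate object; the built-in mtcars is not modified.
cars_with_model <- tibble::rownames_to_column(mtcars, var = "model")

# Apply the filter step only when the flag is set.
if (applyFilter == 1) {
  cars_with_model <- filter(cars_with_model, grepl(x = model, pattern = "Merc"))
}

result <- cars_with_model %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
```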