Conditional filtering using tidyverse

Question

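I want to filter my data frame on a variable that may or may not exist. As the expected output, I want either a filtered df (if it has the filter variable) or the original, unfiltered df (if the variable is missing).

Here is a minimal example: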

library(tidyverse)
df1 <- 
tribble(~a,~b,
        1L,"a",
        0L, "a",
        0L,"b",
        1L, "b")
df2 <- select(df1, b)

Filtering df1 returns the desired result, a filtered tibble.

filter(df1, a == 1)
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

But the second call throws an error (as expected), because the variable is not in the df.

filter(df2, a == 1)
Error in filter_impl(.data, quo) : 
  Evaluation error: object 'a' not found.

I tried filter_at, which would be the obvious choice, but it throws an error if no variable matches the predicate.

filter_at(df2, vars(matches("a")), any_vars(. == 1L))    
Error: `.predicate` has no matching columns

So, my question is: is there a way to create a conditional filter that produces the expected result, preferably within the tidyverse?

Answer 1

Something like this?

# function for expected output:
# filter x on column y if it exists, otherwise return x unchanged
foo <- function(x, y){
  if(y %in% colnames(x)){
    # .data[[y]] looks up the column whose name is stored in y
    filter(x, .data[[y]] == 1)
  }else{
    # return the input itself, not the global df1
    x
  }
}

# run the functions
foo(df1, "a")
foo(df2, "a")
# or

df1 %>% foo("a")
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>% foo("a")
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b
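If several optional columns are involved, the same idea can be extended. The following is only a sketch: filter_present and its conditions argument are hypothetical names, not part of dplyr, and it assumes each condition is a simple equality test and that the data has no columns literally named col or conditions.

```r
library(dplyr)

# Hypothetical helper: `conditions` is a named list mapping column names to
# the value each column must equal. Names missing from `data` are skipped.
filter_present <- function(data, conditions) {
  for (col in intersect(names(conditions), names(data))) {
    data <- filter(data, .data[[col]] == conditions[[col]])
  }
  data
}

df1 <- tibble(a = c(1L, 0L, 0L, 1L), b = c("a", "a", "b", "b"))
df2 <- select(df1, b)

res1 <- filter_present(df1, list(a = 1L, b = "b"))  # filters on a and b
res2 <- filter_present(df2, list(a = 1L, b = "b"))  # filters on b only
```

With no matching names at all, the loop body never runs and the data frame comes back unchanged.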

Answer 2

As @docendo-discimus pointed out in the comments, the following solution works. I also used rlang::has_name instead of "a" %in% names(.).

The original idea comes from another Q&A.

# has_name() takes the object as its first argument; `.` is the piped data
df1 %>% 
   filter(if(has_name(., "a")) a == 1 else TRUE)
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>% 
   filter(if(has_name(., "a")) a == 1 else TRUE)
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b

Alternatively, using {}:

df1 %>%
  {if(has_name(., "a")) filter(., a == 1L) else .}
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>%
  {if(has_name(., "a")) filter(., a == 1L) else .}
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b
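Another option that may be worth trying in newer dplyr versions is to combine if_all() with any_of(). This is a sketch, not part of the answers above: it assumes dplyr 1.0.4 or later (where if_all() exists) and relies on if_all() treating an empty column selection as TRUE for every row.

```r
library(dplyr)

df1 <- tibble(a = c(1L, 0L, 0L, 1L), b = c("a", "a", "b", "b"))
df2 <- select(df1, b)

# any_of() silently ignores names that are not present in the data.
# Assumption: if_all() over an empty selection keeps every row.
res1 <- df1 %>% filter(if_all(any_of("a"), ~ .x == 1L))
res2 <- df2 %>% filter(if_all(any_of("a"), ~ .x == 1L))
```

This keeps the condition inside a single filter() call, so no explicit if/else is needed.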
