Filter data based on subgroups in R

My real data is much more complicated, but suppose it looks like this:

df <- data.frame(
      id = c(1,1,1,2,2,2,2,3,3,3),
      event = c(0,0,0,1,1,1,1,0,0,0),
      day = c(1,3,3,1,6,6,7,1,4,6),
      time = c("2016-10-25 14:00:00", "2016-10-27 12:00:15", "2016-10-27 15:30:00",
                "2016-10-23 11:00:00", "2016-10-28 08:00:15", "2016-10-28 23:00:00", "2016-10-29 12:00:00",
                "2016-10-24 15:00:00", "2016-10-27 15:00:15", "2016-10-29 16:00:00"))
df$time <- as.POSIXct(df$time)

Output:
   id event day                time
1   1     0   1 2016-10-25 14:00:00
2   1     0   3 2016-10-27 12:00:15
3   1     0   3 2016-10-27 15:30:00
4   2     1   1 2016-10-23 11:00:00
5   2     1   6 2016-10-28 08:00:15
6   2     1   6 2016-10-28 23:00:00
7   2     1   7 2016-10-29 12:00:00
8   3     0   1 2016-10-24 15:00:00
9   3     0   4 2016-10-27 15:00:15
10  3     0   6 2016-10-29 16:00:00

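What I need to do:

If event is 0, I want to keep only the last 24 hours for each id. If event is 1, I want to keep the rows where day is 6.

I know how to keep the last 24 hours: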

library(dplyr)
library(lubridate)

last_twentyfour_hours <- df %>%
  group_by(id) %>%
  filter(time > last(time) - hours(24))

But how do I filter differently for each group?

Thank you very much!

Group by 'id' and 'event', then filter with an if/else: if 0 is in 'event', use the OP's condition; otherwise return the rows where 'day' is 6.
library(dplyr)
library(lubridate)
df %>%
  group_by(id, event) %>%
  filter(if (0 %in% event) time > last(time) - hours(24) else
    day == 6) %>%
  ungroup()

Output:

# A tibble: 5 × 4
     id event   day time               
  <dbl> <dbl> <dbl> <dttm>             
1     1     0     3 2016-10-27 12:00:15
2     1     0     3 2016-10-27 15:30:00
3     2     1     6 2016-10-28 08:00:15
4     2     1     6 2016-10-28 23:00:00
5     3     0     6 2016-10-29 16:00:00
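As a cross-check, the same rule can be sketched in base R without dplyr. This is only an illustration, not part of the answer above. It assumes that 'event' is constant within each 'id' and that rows are already ordered by time within each 'id' (both hold for the example data). The time zone is pinned to UTC here as an extra assumption so the result does not depend on the local machine, and the names last_time, keep and result are made up for this sketch.

```r
# Example data from the question; tz = "UTC" is an assumption added for reproducibility.
df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  event = c(0, 0, 0, 1, 1, 1, 1, 0, 0, 0),
  day = c(1, 3, 3, 1, 6, 6, 7, 1, 4, 6),
  time = as.POSIXct(
    c("2016-10-25 14:00:00", "2016-10-27 12:00:15", "2016-10-27 15:30:00",
      "2016-10-23 11:00:00", "2016-10-28 08:00:15", "2016-10-28 23:00:00",
      "2016-10-29 12:00:00", "2016-10-24 15:00:00", "2016-10-27 15:00:15",
      "2016-10-29 16:00:00"),
    tz = "UTC"
  )
)

# Last timestamp of each id (in seconds), repeated for every row of that id.
# Taking the last element mirrors dplyr::last(time) in the grouped filter.
last_time <- ave(as.numeric(df$time), df$id, FUN = function(x) x[length(x)])

# Row-wise rule: event 0 keeps the final 24 hours, event 1 keeps day 6.
keep <- ifelse(df$event == 0,
               as.numeric(df$time) > last_time - 24 * 60 * 60,
               df$day == 6)

result <- df[keep, ]
print(result)
```

The rows kept should match the five rows in the tibble above; the original row names (2, 3, 5, 6, 10) are preserved because a plain data.frame is subset.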

We could also use the & and | operators:

df %>%
  group_by(id) %>%
  filter(event == 0 & time > last(time) - hours(24) |
           event == 1 & day == 6)

Output:

     id event   day time               
  <dbl> <dbl> <dbl> <dttm>             
1     1     0     3 2016-10-27 12:00:15
2     1     0     3 2016-10-27 15:30:00
3     2     1     6 2016-10-28 08:00:15
4     2     1     6 2016-10-28 23:00:00
5     3     0     6 2016-10-29 16:00:00
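
In R, & binds more tightly than |, so a condition of the form a & b | c & d is read as (a & b) | (c & d). The small sketch below illustrates this with made-up logical vectors (recent and day6 are hypothetical stand-ins for the two conditions in the filter, not columns of df).

```r
# Hypothetical stand-ins for the pieces of the filter condition.
event  <- c(0, 0, 1, 1)
recent <- c(TRUE, FALSE, TRUE, FALSE)  # plays the role of time > last(time) - hours(24)
day6   <- c(TRUE, TRUE, FALSE, TRUE)   # plays the role of day == 6

# Written without parentheses, as in the filter call.
without_parens <- event == 0 & recent | event == 1 & day6

# Written with the grouping made explicit.
with_parens <- (event == 0 & recent) | (event == 1 & day6)

print(without_parens)
print(identical(without_parens, with_parens))
```

Adding the parentheses in the filter call is optional, but it can make the intent easier to read.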