Selecting rows following a certain event and within a time difference

Question

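My code isn't working properly. For each ID, I want to get the row that follows a row with a certain value of Factor. But I only want to keep it if the time between the two rows is below some threshold, and I want to test several different thresholds.
Below is some sample data and my attempt with dplyr. The result is not what I want.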

library(dplyr)
library(lubridate)

# Setting up a data.frame
df_Whosebug <- data.frame(ID = c(1,1,
                                  2,2,2,2,
                                  3,3,
                                  4,4,
                                  5,5,5,
                                  6,6,6),
                           time = c("2016-11-11 10:25:07", "2016-11-11 11:13:09",
                                    "2016-11-16 21:13:28", "2016-11-29 19:18:58", "2016-12-01 18:44:41", "2016-12-04 12:46:44",
                                    "2016-12-26 20:49:07", "2016-12-30 11:41:51",
                                    "2016-11-25 10:25:52", "2016-12-26 22:04:36",
                                    "2016-12-07 21:27:53", "2016-12-07 21:52:58", "2016-12-09 18:32:23",
                                    "2016-11-25 14:10:24", "2016-11-25 20:06:43", "2016-11-25 21:07:33"),
                           Factor = c("A","B",
                                      "C","B","B","C",
                                      "B","B",
                                      "A","D",
                                      "D","D","D",
                                      "B","E","B"))

# My attempt at building the result
# I want to keep all rows where the previous value of Factor for that ID was "B".
# The time difference between this row and the previous one must also be under a certain threshold.
# The loop tries a range of thresholds (in days)
df_res <- list()
for(i in 1:15) {
  df_res[[i]] <- df_Whosebug %>%
  group_by(ID) %>%
  filter(lag(Factor) == "B" & as.period(interval(as.POSIXct(time), as.POSIXct(lag(time))), units = "day") < days(i))
}

Any suggestions?

Answer 1

Is this what you want?

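lapply(1:5, function(x) df_Whosebug %>% group_by(ID) %>%
         # days since the previous row of the same ID, plus that row's Factor
         mutate(differ = as.numeric(difftime(time, lag(time)), units = "days"),
                prev = lag(Factor)) %>%
         # keep rows that follow a "B" by less than x days
         filter(prev == "B" & differ < x))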

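Group by ID, take the lag of both Factor and time within each group, and use lapply to loop over whatever thresholds you want. The results are stored in a list, one data frame per threshold. The first row of each ID has no previous row, so its lagged values are NA and filter drops it.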
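To make the lag-then-filter idea easier to check by hand, here is a minimal sketch on a tiny made-up data frame. The names events, thresholds, time_parsed and the "days_" list names are my own choices for illustration and not part of the question's data. It parses the timestamps explicitly, assuming UTC, and names each list element after its threshold so the results can be stacked afterwards.

```r
library(dplyr)

# Made-up sample (not the asker's data): one ID, three events
events <- data.frame(
  ID     = c(1, 1, 1),
  time   = c("2016-11-25 14:10:24", "2016-11-25 20:06:43", "2016-11-29 21:07:33"),
  Factor = c("B", "B", "E"),
  stringsAsFactors = FALSE
)

thresholds <- c(1, 5)  # hypothetical thresholds, in days

res <- lapply(thresholds, function(x) {
  events %>%
    group_by(ID) %>%
    mutate(
      time_parsed = as.POSIXct(time, tz = "UTC"),  # assumption: timestamps are UTC
      # current time minus previous time, so the gap is positive
      differ = as.numeric(difftime(time_parsed, lag(time_parsed), units = "days")),
      prev   = lag(Factor)
    ) %>%
    filter(prev == "B", differ < x) %>%
    ungroup()
})
names(res) <- paste0("days_", thresholds)

# One data frame with a column saying which threshold each row passed
all_res <- bind_rows(res, .id = "threshold")
```

With this sample, the second row follows a "B" by about six hours and the third follows a "B" by about four days, so the 1-day threshold keeps one row and the 5-day threshold keeps both. Note the order of the arguments to difftime: current time first, previous time second. The attempt in the question builds its interval from the current time back to the previous one, which most likely gives a negative length that passes every threshold.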


r

date

lubridate

dplyr