Selecting rows following a certain event and within a time difference
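My code is not working as intended. I want to get the row that comes right after a row with a certain value of Factor. But I only want to keep it when the time difference to that previous row is below some threshold, and I want to test several different thresholds.
Below is some sample data and my attempt with dplyr. The result is not what I want, though.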
library(dplyr)
library(lubridate)
# Setting up a data.frame
df_Whosebug <- data.frame(
  ID = c(1,1,
         2,2,2,2,
         3,3,
         4,4,
         5,5,5,
         6,6,6),
  time = c("2016-11-11 10:25:07", "2016-11-11 11:13:09",
           "2016-11-16 21:13:28", "2016-11-29 19:18:58", "2016-12-01 18:44:41", "2016-12-04 12:46:44",
           "2016-12-26 20:49:07", "2016-12-30 11:41:51",
           "2016-11-25 10:25:52", "2016-12-26 22:04:36",
           "2016-12-07 21:27:53", "2016-12-07 21:52:58", "2016-12-09 18:32:23",
           "2016-11-25 14:10:24", "2016-11-25 20:06:43", "2016-11-25 21:07:33"),
  Factor = c("A","B",
             "C","B","B","C",
             "B","B",
             "A","D",
             "D","D","D",
             "B","E","B"))
# My attempt to save the result
# I want to keep all rows where the previous value for that ID was "B".
# The time difference between this row and the previous one also needs to be under a certain threshold.
# This threshold will be looped over different values
df_res <- list()
for(i in 1:15) {
  df_res[[i]] <- df_Whosebug %>%
    group_by(ID) %>%
    filter(lag(Factor) == "B" & as.period(interval(as.POSIXct(time), as.POSIXct(lag(time))), units = "day") < days(i))
}
Any suggestions?
Is this what you want?
lapply(1:5, function(x) df_Whosebug %>%
  group_by(ID) %>%
  # time since the previous row (in days) and the previous row's Factor, per ID
  mutate(differ = as.numeric(difftime(time, lag(time), units = "days")),
         prev = lag(Factor)) %>%
  filter(prev == "B" & differ < x))
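Group by ID, take the lag of Factor and of time within each group, and loop over whatever thresholds you want with lapply. The results are stored in a list, one data frame per threshold.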
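To make the idea easier to check by hand, here is a minimal self-contained sketch of the same approach. It is only an illustration under a few assumptions that are not in the original post: the helper name rows_after_event and the data frame df_small (just the rows for IDs 2, 3 and 6 from the question) are made up for this example, and the timestamps are assumed to be UTC so that daylight saving time cannot shift the differences.

```r
library(dplyr)

# Hypothetical subset of the question's data (IDs 2, 3 and 6 only)
df_small <- data.frame(
  ID = c(2, 2, 2, 2, 3, 3, 6, 6, 6),
  time = c("2016-11-16 21:13:28", "2016-11-29 19:18:58",
           "2016-12-01 18:44:41", "2016-12-04 12:46:44",
           "2016-12-26 20:49:07", "2016-12-30 11:41:51",
           "2016-11-25 14:10:24", "2016-11-25 20:06:43",
           "2016-11-25 21:07:33"),
  Factor = c("C", "B", "B", "C", "B", "B", "B", "E", "B"))

# Hypothetical helper: keep rows whose previous row (within the same ID)
# has Factor == prev_value and is less than max_days older.
rows_after_event <- function(data, prev_value, max_days) {
  data %>%
    mutate(time = as.POSIXct(time, tz = "UTC")) %>%  # assumption: times are UTC
    arrange(ID, time) %>%                            # lag() relies on row order
    group_by(ID) %>%
    mutate(prev = lag(Factor),
           differ = as.numeric(difftime(time, lag(time), units = "days"))) %>%
    ungroup() %>%
    filter(prev == prev_value, differ < max_days)    # NA rows are dropped by filter()
}

thresholds <- 1:5
df_res <- lapply(thresholds, function(x) rows_after_event(df_small, "B", x))
names(df_res) <- paste0("days_", thresholds)

# Number of rows kept per threshold
sapply(df_res, nrow)
```

With this subset, the gaps after a "B" row are roughly 0.25, 1.98, 2.75 and 3.62 days, so each extra day of threshold up to 4 lets one more row through. Converting time to POSIXct once up front, instead of inside difftime, also keeps the column usable for later steps.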