How can I get the k rows surrounding each row that meets a given condition, in each direction, in an R data frame?

A dplyr solution is preferred.

Suppose I have the following data:

library(tibble)

dat <- tribble(
~a, ~b, ~c, ~d, ~e,
1, 2, 3, 4, FALSE,
5, 6, 7, 8, TRUE,
9, 10, 11, 12, TRUE,
13, 14, 15, 16, FALSE,
17, 18, 19, 20, FALSE,
21, 22, 23, 24, FALSE,
25, 26, 27, 28, TRUE,
29, 30, 31, 32, FALSE,
33, 34, 35, 36, FALSE,
37, 38, 39, 40, FALSE
)

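I want to extract the rows where e is TRUE, and also every row within a window of k rows in either direction of a row where e is TRUE, regardless of that row's own value of e. For example, if k = 1, I would want: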

tribble(
~a, ~b, ~c, ~d, ~e,
1, 2, 3, 4, FALSE,
5, 6, 7, 8, TRUE,
9, 10, 11, 12, TRUE,
13, 14, 15, 16, FALSE,
21, 22, 23, 24, FALSE,
25, 26, 27, 28, TRUE,
29, 30, 31, 32, FALSE
)

If k = 2, I would want:

tribble(
~a, ~b, ~c, ~d, ~e,
1, 2, 3, 4, FALSE,
5, 6, 7, 8, TRUE,
9, 10, 11, 12, TRUE,
13, 14, 15, 16, FALSE,
17, 18, 19, 20, FALSE,
21, 22, 23, 24, FALSE,
25, 26, 27, 28, TRUE,
29, 30, 31, 32, FALSE,
33, 34, 35, 36, FALSE
)

Here is one possible solution:

# selection window size
k <- 1

# row numbers where e is TRUE (assumes the question's data is stored in `dat`)
foundrows <- which(dat$e)
# build row indices from each found row +/- the window size
selectedRows <- unlist(lapply(foundrows, function(z) seq(z - k, z + k)))
# remove overlaps and out-of-bounds subscripts
selectedRows <- sort(unique(selectedRows))
selectedRows <- selectedRows[selectedRows > 0 & selectedRows <= nrow(dat)]

dat[selectedRows, ]

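This is not as direct as using the lag/lead functions, but it does make it easy to adjust the window size. It uses base R and restricts the row indices to the bounds of the data frame.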
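Since the question prefers dplyr, here is a rough sketch of what a lag/lead variant might look like. The helper name `near_true` is made up for illustration and is not part of dplyr or of the answer above. It assumes the question's data is stored in `dat` and that padding the shifted vectors with `FALSE` at the edges is the desired behaviour.

```r
library(dplyr)
library(tibble)

# The example data from the question.
dat <- tribble(
  ~a, ~b, ~c, ~d, ~e,
   1,  2,  3,  4, FALSE,
   5,  6,  7,  8, TRUE,
   9, 10, 11, 12, TRUE,
  13, 14, 15, 16, FALSE,
  17, 18, 19, 20, FALSE,
  21, 22, 23, 24, FALSE,
  25, 26, 27, 28, TRUE,
  29, 30, 31, 32, FALSE,
  33, 34, 35, 36, FALSE,
  37, 38, 39, 40, FALSE
)

# Hypothetical helper (an assumption, not an existing dplyr function):
# returns TRUE wherever `x` is TRUE at the current position or at any
# position up to `k` steps before or after it.
near_true <- function(x, k) {
  out <- x
  for (i in seq_len(k)) {
    # lag() shifts values down by i rows, lead() shifts them up by i rows;
    # positions shifted in from outside the vector are filled with FALSE.
    out <- out |
      dplyr::lag(x, i, default = FALSE) |
      dplyr::lead(x, i, default = FALSE)
  }
  out
}

k <- 1
result <- dat %>% filter(near_true(e, k))
result
```

Because the window is built from shifted copies of `e`, no out-of-bounds handling is needed, and changing `k` is still a one-line edit. The base R version above may be preferable when k is large, since this sketch does 2k vector shifts.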