Apply a condition if val holds for consecutive minutes

I have this data:

library(tidyverse)
library(lubridate)

dates <- c("01/01/18 1:00:00 PM" ,"01/01/18 1:01:00 PM",
           "01/01/18 1:02:00 PM" ,"01/01/18 1:03:00 PM",
           "01/01/18 1:04:00 PM" ,"01/01/18 1:05:00 PM",
           "01/01/18 1:06:00 PM" ,"01/01/18 1:07:00 PM",
           "01/01/18 1:08:00 PM" ,"01/01/18 1:09:00 PM",
           "01/01/18 1:10:00 PM" ,"01/01/18 1:11:00 PM")

vals <- c(1, 2, 3, 3, 15, 16, 17, 18, 1, 2, 1, 22)

datfr <- data.frame(dates, vals)

datfr$dates <- dmy_hms(datfr$dates)

I want to apply a condition:

if val is < 4 for a continuous 2-minute period, then TRUE

I tried:

datfr$gr <- datfr %>%
       group_by(by2min = cut(dates, "2 min")) %>%
       summarise(cond = (vals < 4))

But it gives me:

column cond must be length 1 not 2

I'm not sure about this approach.

So, my expected output:

                 dates vals  cond
1  2018-01-01 13:00:00    1
2  2018-01-01 13:01:00    2
3  2018-01-01 13:02:00    3  TRUE
4  2018-01-01 13:03:00    3
5  2018-01-01 13:04:00   15 FALSE
6  2018-01-01 13:05:00   16
7  2018-01-01 13:06:00   17 FALSE
8  2018-01-01 13:07:00   18
9  2018-01-01 13:08:00    1 FALSE
10 2018-01-01 13:09:00    2
11 2018-01-01 13:10:00    1  TRUE
12 2018-01-01 13:11:00   22

So, if val is less than 4 for 2 consecutive minutes, the result is TRUE.

How about using rollapply?

zoo::rollapply(datfr$vals, 3, by = 1, function(x) all(x < 4))

Edit: simplified the function

Assuming your data is laid out so that consecutive rows are exactly 1 minute apart:

datfr$cond <- zoo::rollapply(
  data = datfr$vals, width = 3,
  FUN = function(x) all(x < 4),  # TRUE only if all three readings are below 4
  align = "right", fill = FALSE  # window ends at the current row; incomplete windows give FALSE
)

Result:

#                 dates vals  cond
#1  2018-01-01 13:00:00    1 FALSE
#2  2018-01-01 13:01:00    2 FALSE
#3  2018-01-01 13:02:00    3  TRUE
#4  2018-01-01 13:03:00    3  TRUE
#5  2018-01-01 13:04:00   15 FALSE
#6  2018-01-01 13:05:00   16 FALSE
#7  2018-01-01 13:06:00   17 FALSE
#8  2018-01-01 13:07:00   18 FALSE
#9  2018-01-01 13:08:00    1 FALSE
#10 2018-01-01 13:09:00    2 FALSE
#11 2018-01-01 13:10:00    1  TRUE
#12 2018-01-01 13:11:00   22 FALSE
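
For comparison, here is a minimal sketch of the same trailing three-row check written with dplyr::lag instead of zoo. It rebuilds the example data with seq() rather than parsing the strings, and the helper column `below` is just an illustrative name, not something from the question. It relies on the same assumption that rows are exactly 1 minute apart, and checks that assumption up front.

```r
library(dplyr)

# Example data rebuilt for self-containment (same values as in the question)
datfr <- data.frame(
  dates = seq(as.POSIXct("2018-01-01 13:00:00", tz = "UTC"),
              by = "1 min", length.out = 12),
  vals  = c(1, 2, 3, 3, 15, 16, 17, 18, 1, 2, 1, 22)
)

# Guard the assumption: every gap between consecutive rows is 1 minute
stopifnot(all(as.numeric(diff(datfr$dates), units = "mins") == 1))

res <- datfr %>%
  arrange(dates) %>%
  mutate(
    below = vals < 4,  # hypothetical helper column
    # TRUE only if the current row and the two rows before it are all below 4;
    # the first two rows have no complete window, so they default to FALSE
    cond = below &
      lag(below, 1, default = FALSE) &
      lag(below, 2, default = FALSE)
  ) %>%
  select(-below)

print(res)
```

This should give the same cond column as the rollapply call with align = "right" and fill = FALSE. If the timestamps can have gaps, a row-based window is no longer equivalent to a 2-minute window and a time-based approach would be needed instead.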

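I have tried to reproduce your desired output as closely as possible. I assumed the empty elements of cond are NA. If cond is a character variable and the empty elements stand for \s, the output is easy to adjust by adding an extra mutate(cond = coalesce(as.character(cond), "")). I have left the conversion of the last value to \s/NA as a remaining step.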

#library(tidyverse)

datfr %>%
  arrange(dates) %>%
  group_by(by2min = lag(cut(c(min(dates), dates), "2 min"))[-1]) %>%
  mutate(dates = max(dates)) %>%
  group_by(dates) %>%
  summarise(cond = all(vals < 4), vals = last(vals)) %>%
  right_join(datfr, by = c('dates', 'vals')) %>%
  select(dates, vals, cond)

# # A tibble: 12 x 3
#   dates                vals cond 
#   <dttm>              <dbl> <lgl>
# 1 2018-01-01 13:00:00     1 NA   
# 2 2018-01-01 13:01:00     2 NA   
# 3 2018-01-01 13:02:00     3 TRUE 
# 4 2018-01-01 13:03:00     3 NA   
# 5 2018-01-01 13:04:00    15 FALSE
# 6 2018-01-01 13:05:00    16 NA   
# 7 2018-01-01 13:06:00    17 FALSE
# 8 2018-01-01 13:07:00    18 NA   
# 9 2018-01-01 13:08:00     1 FALSE
#10 2018-01-01 13:09:00     2 NA   
#11 2018-01-01 13:10:00     1 TRUE 
#12 2018-01-01 13:11:00    22 FALSE
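
The character conversion mentioned in this answer can be sketched on a small stand-in tibble. The names `res` and `res_chr` and the five sample values are only for illustration; the real input would be the result of the pipeline above.

```r
library(dplyr)

# Stand-in for the first rows of the pipeline result (logical with NAs)
res <- tibble(cond = c(NA, NA, TRUE, NA, FALSE))

# Turn the logical column into character and replace NA with an empty string
res_chr <- res %>%
  mutate(cond = coalesce(as.character(cond), ""))

print(res_chr$cond)
```

After this step cond is a character column, so it can no longer be used directly in logical tests such as filter(cond).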