Creating a logical variable to identify the row within a group that has the minimum difference between two date-times

Reproducible sample data (dput output):

dat <- structure(list(id = c(1, 1, 1, 2, 3, 3, 4),
start = structure(c(1546326000, 
1546326060, 1546326270, 1546722600, 1546884300, 1546884720,  
1547102430), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
event_time = structure(c(1546326059, 1546326059, 1546326059, 
1546722930, 1546884480, 1546884480, NA), 
tzone = "UTC", class = c("POSIXct", "POSIXt"))), 
.Names = c("id", "start", "event_time"), row.names = c(NA, -7L),
class = "data.frame")

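I have some messy data from disparate sources, and I am trying to create a new logical variable that identifies which observation within a group (id) has the smallest positive time difference between the start and event_time variables, ideally within dplyr.

I have tried a few approaches but cannot find one that works. So far, my idea is to create a new variable that holds the time difference between event_time and start, force it to NA when that difference is negative, and then derive the desired variable from it.

Code: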

dat %>% mutate(difference = ifelse(event_time > start, 
                                          event_time - start,
                                          NA)) %>%
    mutate(difference = as.integer(difference)) %>%
    group_by(id) %>%
    mutate(is_closest = row_number() == which.min(difference))

However, this gives me an error and does not create the variable is_closest.

What I am looking for, in its simplest form, is:

Check this solution:

library(lubridate)
library(dplyr)

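dat %>%
  # Seconds from start to event_time (negative when the event precedes start)
  mutate(time_diff = start %--% event_time %>% as.numeric()) %>%
  group_by(id) %>%
  mutate(
    # Smallest non-negative difference within each id
    min_diff = time_diff[time_diff >= 0] %>% min(),
    # TRUE for the row(s) that attain that minimum
    min_diff_gr = time_diff == min_diff
  )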
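
If a plain TRUE/FALSE column is preferred (FALSE rather than NA for a group with no usable difference, such as id 4), one possible variation is sketched below. It is not part of the solution above. It follows the approach described in the question, and the column name diff_secs is an illustrative choice. The sketch relies on which.min() returning integer(0) when every value is NA, which is why the comparison uses %in% instead of ==.

```r
library(dplyr)

# Same sample data as in the question, rebuilt from the epoch seconds.
dat <- data.frame(
  id = c(1, 1, 1, 2, 3, 3, 4),
  start = as.POSIXct(c(1546326000, 1546326060, 1546326270, 1546722600,
                       1546884300, 1546884720, 1547102430),
                     origin = "1970-01-01", tz = "UTC"),
  event_time = as.POSIXct(c(1546326059, 1546326059, 1546326059,
                            1546722930, 1546884480, 1546884480, NA),
                          origin = "1970-01-01", tz = "UTC")
)

result <- dat %>%
  mutate(
    # Seconds from start to event_time (diff_secs is a hypothetical name)
    diff_secs = as.numeric(difftime(event_time, start, units = "secs")),
    # Keep only strictly positive differences; everything else becomes NA
    diff_secs = if_else(diff_secs > 0, diff_secs, NA_real_)
  ) %>%
  group_by(id) %>%
  mutate(
    # which.min() skips NA and returns integer(0) if all values are NA,
    # so %in% yields FALSE for every row of such a group
    is_closest = row_number() %in% which.min(diff_secs)
  ) %>%
  ungroup()

result
```

Note that which.min() returns only the first position of the minimum, so if two rows in a group tie, only the first is flagged.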