Error in dplyr::summarise when working with datetimes and lubridate::dseconds

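I have a tibble representing log messages. It has (among others) two columns, EventDateTime and FileCreationDateTime.

What I want to do now is find the start time, end time, and duration of each log file (identified by FileCreationDateTime). I think (or thought) this could be done with the following code: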

file_durations <- 
  logMessages %>%
  group_by(FileCreationDateTime) %>% 
  summarise(start = min(EventDateTime),
            end = max(EventDateTime),
            duration = dseconds(end - start))

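The code itself seems to run without error, but I can neither print the result nor access it (at least not the column "duration"), because doing so returns the following error: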

Error in sprintf("%ds (~%s %ss)", x, x2, unit, "s)") : 
  invalid format '%d'; use format %f, %e, %g or %a for numeric objects

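After some investigation, I found that the error seems to depend on the exact values of the datetimes. I put together an MWE with two tibbles that differ in only one value. One works, while the other does not. I have no idea what causes the error. Can anyone point me in the right direction?

The tibbles in human-readable form: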

> working
# A tibble: 2 × 2
            EventDateTime FileCreationDateTime
                   <dttm>               <dttm>
1 2016-11-24 16:16:44.986  2016-11-24 16:16:46
2 2016-11-24 16:17:43.282  2016-11-24 16:16:46

> broken
# A tibble: 2 × 2
            EventDateTime FileCreationDateTime
                   <dttm>               <dttm>
1 2016-11-24 16:16:44.986  2016-11-24 16:16:46
2 2016-11-24 16:18:31.971  2016-11-24 16:16:46

The full MWE:

library(tidyverse)
library(lubridate)

options(digits.secs = 6, digits = 6)

working <- structure(list(EventDateTime = structure(c(1480004204.987, 1480004263.283),
                                                    class = c("POSIXct", "POSIXt"),
                                                    tzone = "UTC"),
                          FileCreationDateTime = structure(c(1480000606, 1480000606),
                                                           class = c("POSIXct", "POSIXt"),
                                                           tzone = "Europe/Vienna")),
                     .Names = c("EventDateTime", "FileCreationDateTime"),
                     row.names = c(NA, -2L),
                     class = c("tbl_df", "tbl", "data.frame"))

working %>%
  group_by(FileCreationDateTime) %>% 
  summarise(start = min(EventDateTime),
            end = max(EventDateTime),
            duration = dseconds(end - start))

broken  <- structure(list(EventDateTime = structure(c(1480004204.987, 1480004311.972),
                                                    class = c("POSIXct", "POSIXt"),
                                                    tzone = "UTC"),
                          FileCreationDateTime = structure(c(1480000606, 1480000606),
                                                           class = c("POSIXct", "POSIXt"),
                                                           tzone = "Europe/Vienna")),
                     .Names = c("EventDateTime", "FileCreationDateTime"),
                     row.names = c(NA, -2L),
                     class = c("tbl_df", "tbl", "data.frame"))

broken %>%
  group_by(FileCreationDateTime) %>% 
  summarise(start = min(EventDateTime),
            end = max(EventDateTime),
            duration = dseconds(end - start))

I am using R 3.4.0 64-bit, lubridate_1.6.0 and dplyr_0.5.0 on Windows 10.

Thanks for your help!

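I finally found the problem. It has nothing to do with dplyr; the cause is lubridate::dseconds. As has already been reported (e.g. this issue), it fails for non-integer input > 60. The working tibble yields a duration of about 58.3 seconds, while the broken one yields about 107 seconds, so this is evidently my problem as well.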
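Until the fix is available, one possible workaround is to skip dseconds and keep the duration as a plain number of seconds. The sketch below uses only base R and the two timestamps from the broken tibble. The names duration_secs and fmt_fails are made up for illustration, and the sprintf call is only meant to reproduce the kind of failure shown in the error message, not lubridate's actual code.

```r
# Timestamps copied from the `broken` tibble in the question.
start <- as.POSIXct(1480004204.987, origin = "1970-01-01", tz = "UTC")
end   <- as.POSIXct(1480004311.972, origin = "1970-01-01", tz = "UTC")

# Ask for seconds explicitly; a plain `end - start` chooses its own units.
duration_secs <- as.numeric(difftime(end, start, units = "secs"))

# Base R's sprintf() rejects "%d" for a double that is not a whole number,
# which matches the message reported in the question.
fmt_fails <- inherits(try(sprintf("%d", duration_secs), silent = TRUE),
                      "try-error")

# Inside summarise(), the same expression could stand in for dseconds():
#   duration = as.numeric(difftime(end, start, units = "secs"))
```

Here duration_secs comes out at roughly 107 seconds and prints without trouble, since it is an ordinary numeric column rather than a Duration object.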