What is an efficient way to round/floor ITime-formatted time from the data.table library?

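What is an efficient way to round or floor a time stored in the ITime format from the data.table library?

My current approach is to convert to POSIXct, floor the result, and then convert back to ITime. For example: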

library(lubridate)
library(data.table)

# Suppose I have some ITime variable:
Time = as.ITime( Sys.time() )

# This is what I do:
as.ITime( floor_date( as.POSIXct( Time ), "5 minutes"), format = "%H:%M:%S")

# Result:
[1] "16:05:00"

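This works fine, but the double conversion seems inefficient. Is there a better option?

You can take advantage of the fact that ITime variables are stored internally as integers (the number of seconds since midnight).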

library(lubridate)
library(data.table)

# Let's generate an ITime variable

set.seed(233)
y <- as.ITime(sample(60*60*24, size = 1e6, replace = TRUE))  # 60*60*24: max number of seconds in a day 

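Since 5 minutes is 60 * 5 seconds (300 seconds), you can divide the variable by 300, take the floor, and then multiply by 300 again. The integer division operator %/% performs the first two steps at once.

# head of the data using this method and the one you suggested: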
head(data.table(
    y = y,
    method1 = (y %/% 300L) * 300L,
    method2 = as.ITime( floor_date( as.POSIXct( y ), "5 minutes"), format = "%H:%M:%S")),
    n = 10)
  
           y  method1  method2
 1: 13:21:33 13:20:00 13:20:00
 2: 13:24:11 13:20:00 13:20:00
 3: 18:02:47 18:00:00 18:00:00
 4: 20:06:51 20:05:00 20:05:00
 5: 19:59:35 19:55:00 19:55:00
 6: 16:35:46 16:35:00 16:35:00
 7: 16:32:10 16:30:00 16:30:00
 8: 15:57:35 15:55:00 15:55:00
 9: 01:21:16 01:20:00 01:20:00
10: 17:10:09 17:10:00 17:10:00
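
The same arithmetic extends to other intervals and to rounding rather than flooring. Below is a minimal sketch; `floor_itime` and `round_itime` are hypothetical helper names, not part of data.table, and wrapping a rounded-up value back to 00:00:00 at midnight is an assumption about the desired behaviour.

```r
library(data.table)

# Hypothetical helpers (not provided by data.table).
# Floor an ITime to the nearest lower multiple of `secs` seconds.
floor_itime <- function(x, secs = 300L) {
  secs <- as.integer(secs)
  as.ITime((as.integer(x) %/% secs) * secs)
}

# Round an ITime to the nearest multiple of `secs` seconds:
# add half an interval, then floor. Values that round up to
# 24:00:00 are wrapped to 00:00:00 (an assumed design choice).
round_itime <- function(x, secs = 300L) {
  secs <- as.integer(secs)
  shifted <- as.integer(x) + secs %/% 2L
  as.ITime(((shifted %/% secs) * secs) %% 86400L)
}

x <- as.ITime(c("13:21:33", "13:22:30", "23:59:00"))
floor_itime(x)  # 13:20:00 13:20:00 23:55:00
round_itime(x)  # 13:20:00 13:25:00 00:00:00
```

Working on `as.integer(x)` and converting back with `as.ITime()` keeps the helpers independent of how arithmetic operators treat the ITime class.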

Timings

microbenchmark::microbenchmark(
  method1 = (y %/% 300L) * 300L,
  method2 = as.ITime( floor_date( as.POSIXct( y ), "5 minutes"), format = "%H:%M:%S"),
  times = 5L
)

Unit: milliseconds
    expr      min       lq      mean   median       uq      max neval
 method1   7.5192   7.7691   8.23544   8.0286   8.8695   8.9908     5
 method2 396.5867 404.5420 418.07694 412.6798 436.3783 440.1979     5
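
Because the floored value is still an ITime, it can presumably be used directly as a grouping key. The following is a sketch on made-up data; the table `dt` and its columns `time` and `value` are illustrative assumptions, not from the question.

```r
library(data.table)

set.seed(1)
# Illustrative data: 1000 random times of day with a random value each
dt <- data.table(
  time  = as.ITime(sample(0:86399, 1000, replace = TRUE)),
  value = rnorm(1000)
)

# Add the 5-minute bin by reference using the integer-division trick
dt[, bin := (time %/% 300L) * 300L]

# Count rows and average the value within each 5-minute bin
agg <- dt[, .(n = .N, mean_value = mean(value)), by = bin][order(bin)]
head(agg)
```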