Changing value-change format data to a standard 30-second format in R

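I would like to convert data recorded in a non-standard value-change format (a reading is logged only when Value changes) into a standard format with regular 30-second intervals.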

What I have, df:

Timestamp   Value
6/26/2018 0:00:06   10
6/26/2018 0:01:06   15
6/26/2018 0:02:15   20

dput

structure(list(Timestamp = c("6/26/2018 0:00:06", "6/26/2018 0:01:06", 
"6/26/2018 0:02:15"), Value = c(10L, 15L, 20L)), .Names = c("Timestamp", 
"Value"), class = "data.frame", row.names = c(NA, -3L))

What I want, formatted_df:

Timestamp   Value
6/26/2018 0:00:30   10
6/26/2018 0:01:00   10
6/26/2018 0:01:30   15
6/26/2018 0:02:00   15
6/26/2018 0:02:30   20

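My attempt:

Using functions from lubridate and dplyr, I get timestamps that are multiples of 30 seconds, but the result is not a regular series with one row per 30-second step: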

formatted <- df %>% mutate(Timestamp_Date = as.POSIXct(Timestamp, tz = "US/Eastern", usetz = TRUE, format="%m/%d/%Y %H:%M:%S"),
                           rounded_timestamp = ceiling_date(Timestamp_Date, unit = "30 seconds"))

formatted:

Timestamp   Value   Timestamp_Date  rounded_timestamp
6/26/2018 0:00:06   10  6/26/2018 0:00:06   6/26/2018 0:00:30
6/26/2018 0:01:06   15  6/26/2018 0:01:06   6/26/2018 0:01:30
6/26/2018 0:02:15   20  6/26/2018 0:02:15   6/26/2018 0:02:30

I think lubridate and dplyr are useful here, but I bet data.table can do it.

You can use a data.table rolling join.

library(data.table)

#convert df into data.table and Timestamp into POSIX format
setDT(df)[, Timestamp := as.POSIXct(Timestamp, format="%m/%d/%Y %H:%M:%S")]

#create the intervals of 30seconds according to needs
tstmp <- seq(as.POSIXct("2018-06-26 00:00:30", tz=""), 
    as.POSIXct("2018-06-26 00:02:30", tz=""), 
    by="30 sec")

#rolling join between intervals and df
df[.(Timestamp=tstmp), on=.(Timestamp), roll=Inf]

Output:

             Timestamp Value
1: 2018-06-26 00:00:30    10
2: 2018-06-26 00:01:00    10
3: 2018-06-26 00:01:30    15
4: 2018-06-26 00:02:00    15
5: 2018-06-26 00:02:30    20
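
The start and end of the 30-second grid are hard-coded above. As a minimal sketch, they could instead be derived from the data by rounding the first and last timestamps up to the next multiple of 30 seconds. The helper name ceil30 and the use of tz = "UTC" are assumptions made here for reproducibility; they are not part of the original question.

```r
library(data.table)

# Sample data from the question
df <- data.frame(
  Timestamp = c("6/26/2018 0:00:06", "6/26/2018 0:01:06", "6/26/2018 0:02:15"),
  Value = c(10L, 15L, 20L)
)

# Parse the timestamps; the UTC time zone is an assumption for this sketch
setDT(df)[, Timestamp := as.POSIXct(Timestamp, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")]

# Hypothetical helper: round a POSIXct value up to the next multiple of 30 seconds
ceil30 <- function(x) {
  as.POSIXct(ceiling(as.numeric(x) / 30) * 30, origin = "1970-01-01", tz = "UTC")
}

# Build the regular grid from the observed time range
grid <- seq(ceil30(min(df$Timestamp)), ceil30(max(df$Timestamp)), by = "30 sec")

# Rolling join: each grid point takes the most recent observation at or before it
formatted_df <- df[.(Timestamp = grid), on = .(Timestamp), roll = Inf]
print(formatted_df)
```

Rounding with base arithmetic keeps the sketch free of extra packages; lubridate's ceiling_date(x, "30 seconds") from the question could be used for the same purpose.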

For details, read about the roll argument in ?data.table.
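
To illustrate what the roll argument changes, here is a small sketch comparing three settings on the same data: roll = Inf (carry the last observation forward, as used above), roll = -Inf (carry the next observation backward), and roll = "nearest". The object names obs and grid and the tz = "UTC" setting are assumptions for this example.

```r
library(data.table)

# Observations logged only when the value changes (UTC is an assumption)
obs <- data.table(
  Timestamp = as.POSIXct(c("2018-06-26 00:00:06", "2018-06-26 00:01:06",
                           "2018-06-26 00:02:15"), tz = "UTC"),
  Value = c(10L, 15L, 20L)
)

# Regular 30-second grid to look up
grid <- data.table(
  Timestamp = seq(as.POSIXct("2018-06-26 00:00:30", tz = "UTC"),
                  as.POSIXct("2018-06-26 00:02:30", tz = "UTC"),
                  by = "30 sec")
)

# Last observation carried forward
locf <- obs[grid, on = .(Timestamp), roll = Inf]

# Next observation carried backward; grid points after the last
# observation get NA under the default rollends
nocb <- obs[grid, on = .(Timestamp), roll = -Inf]

# Closest observation in either direction
nearest <- obs[grid, on = .(Timestamp), roll = "nearest"]

print(cbind(grid, locf = locf$Value, nocb = nocb$Value, nearest = nearest$Value))
```

For value-change data, roll = Inf is the setting that matches the desired output, since a value stays in effect until the next change is logged.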