Calculating cumulative time in R
I have a data frame that looks like this:
POI LOCAL.DATETIME
1 1 2017-07-11 15:02:13
2 1 2017-07-11 15:20:28
3 2 2017-07-11 15:20:31
4 2 2017-07-11 15:21:13
5 3 2017-07-11 15:21:18
6 3 2017-07-11 15:21:21
7 2 2017-07-11 15:21:25
8 2 2017-07-11 15:21:59
9 1 2017-07-11 15:22:02
10 1 2017-07-11 15:22:05
I would like to calculate (possibly using lubridate) the cumulative time spent at each POI and combine the results into a table that looks like this:
POI TOTAL.TIME
1 1 00:18:18
2 2 00:01:11
3 3 00:00:03
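Also, I'm not sure how to handle the time between POIs, e.g. the 3 seconds between rows 2 and 3. I think I may need to calculate the time from row 1 to row 3 rather than from row 1 to row 2.
To get the total time per group of consecutive time periods, you first need to create a group index. I used rleid from data.table, which gives every run of consecutive rows with the same POI its own number. You can then calculate the time spent in each of those groups and use sum to summarise by the original POI.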
df <- read.table(text=" POI LOCAL.DATETIME
1 '2017-07-11 15:02:13'
1 '2017-07-11 15:20:28'
2 '2017-07-11 15:20:31'
2 '2017-07-11 15:21:13'
3 '2017-07-11 15:21:18'
3 '2017-07-11 15:21:21'
2 '2017-07-11 15:21:25'
2 '2017-07-11 15:21:59'
1 '2017-07-11 15:22:02'
1 '2017-07-11 15:22:05'",
header=TRUE,stringsAsFactors=FALSE)
df$LOCAL.DATETIME <- as.POSIXct(df$LOCAL.DATETIME)
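library(dplyr)
df %>%
  # index each run of consecutive rows that share the same POI
  mutate(grp = data.table::rleid(POI)) %>%
  group_by(grp) %>%
  # duration of each run: last timestamp minus first timestamp
  summarise(POI = max(POI),
            TOTAL.TIME = difftime(max(LOCAL.DATETIME), min(LOCAL.DATETIME), units = "secs")) %>%
  # add up the run durations per POI
  group_by(POI) %>%
  summarise(TOTAL.TIME = sum(TOTAL.TIME))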
# A tibble: 3 × 2
POI TOTAL.TIME
<int> <time>
1 1 1098 secs
2 2 76 secs
3 3 3 secs
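To make the group index more concrete, here is a minimal sketch of what rleid produces for the POI column in the question. It uses base R's rle as a stand-in (an assumption made only for illustration, so that the example runs without data.table); the variable names poi, runs and grp are hypothetical.

```r
# POI sequence from the question
poi <- c(1, 1, 2, 2, 3, 3, 2, 2, 1, 1)

# Base R stand-in for data.table::rleid(): number each run of equal values
runs <- rle(poi)
grp <- rep(seq_along(runs$lengths), times = runs$lengths)

print(grp)
# 1 1 2 2 3 3 4 4 5 5
```

The second visit to POI 2 gets index 4 rather than 2, which is why the two visits are measured separately before being summed.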
To get minutes and seconds, you can use as.period from lubridate:
library(lubridate)
df %>%
  mutate(grp = data.table::rleid(POI)) %>%
  group_by(grp) %>%
  summarise(POI = max(POI),
            TOTAL.TIME = difftime(max(LOCAL.DATETIME), min(LOCAL.DATETIME), units = "secs")) %>%
  group_by(POI) %>%
  summarise(TOTAL.TIME = sum(TOTAL.TIME)) %>%
  mutate(TOTAL.TIME = as.period(TOTAL.TIME, unit = "sec"))
POI TOTAL.TIME
<int> <S4: Period>
1 1 18M 18S
2 2 1M 16S
3 3 3S
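If the HH:MM:SS layout from the question is wanted instead of a Period, one possible sketch is to format the second totals with base R's sprintf. The helper fmt_hms is hypothetical (it is not part of lubridate), and the input values are the totals in seconds reported above.

```r
# Totals in seconds for POI 1, 2 and 3, taken from the output above
secs <- c(1098, 76, 3)

# Hypothetical helper (not from lubridate): format whole seconds as HH:MM:SS
fmt_hms <- function(s) {
  s <- as.integer(round(s))
  sprintf("%02d:%02d:%02d", s %/% 3600L, (s %% 3600L) %/% 60L, s %% 60L)
}

fmt_hms(secs)
# "00:18:18" "00:01:16" "00:00:03"
```

With the difftime column from the pipeline, the same helper could be applied as fmt_hms(as.numeric(TOTAL.TIME, units = "secs")).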
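Another option, using data.table, is to create groups of 2 rows within each POI, take the time difference between the two rows of each group, and finally sum by POI. This assumes every visit is recorded as exactly two rows (a start and an end), as in the sample data: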
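library(data.table)
dt <- as.data.table(df)
# pair up the rows within each POI: rows 1-2 form pair 1, rows 3-4 form pair 2, ...
dt[, grp2 := (seq_len(.N) + 1) %/% 2, by = POI]
# within each pair, the time elapsed since the previous row (NA for the first row)
dt[, time_diff := difftime(LOCAL.DATETIME, shift(LOCAL.DATETIME), units = "mins"), by = .(POI, grp2)]
dt[, .(TOTAL.TIME = sum(time_diff, na.rm = TRUE)), by = POI]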
# POI TOTAL.TIME
#1: 1 18.300000 mins
#2: 2 1.266667 mins
#3: 3 0.050000 mins
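Both answers leave the gaps between POIs uncounted. If the gap should instead be attributed to the POI being left, as the question suggests, one possible sketch is to end each run at the first timestamp of the next run. This is an assumption about the desired behaviour, not something either answer states; it uses base R only, and the names t, start, last, end and total are hypothetical.

```r
# Sample data from the question (the UTC time zone is an assumption)
df <- data.frame(
  POI = c(1, 1, 2, 2, 3, 3, 2, 2, 1, 1),
  LOCAL.DATETIME = as.POSIXct(c(
    "2017-07-11 15:02:13", "2017-07-11 15:20:28",
    "2017-07-11 15:20:31", "2017-07-11 15:21:13",
    "2017-07-11 15:21:18", "2017-07-11 15:21:21",
    "2017-07-11 15:21:25", "2017-07-11 15:21:59",
    "2017-07-11 15:22:02", "2017-07-11 15:22:05"
  ), tz = "UTC")
)

# Index each run of consecutive rows with the same POI
runs <- rle(df$POI)
grp <- rep(seq_along(runs$lengths), times = runs$lengths)

# First and last timestamp of each run, in seconds since the epoch
t <- as.numeric(df$LOCAL.DATETIME)
start <- as.numeric(tapply(t, grp, min))
last <- as.numeric(tapply(t, grp, max))

# A run ends when the next run starts; the final run ends at its own last row
end <- c(start[-1], last[length(last)])

# Total seconds per POI, gaps included
total <- tapply(end - start, runs$values, sum)
total
# POI 1: 1101, POI 2: 84, POI 3: 7 (seconds)
```

The three totals add up to 1192 seconds, the full span from the first to the last timestamp, so no time is left unassigned under this convention.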