How to calculate the hours between two character columns in the format "dd-mm-yyyy hh:mm"
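I have a data frame u in which time1 and time2 are both stored in dd-mm-yyyy hh:mm format. I want to create a new variable containing the number of hours between u$time1 and u$time2.
Both columns are stored as character: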
str(u)
'data.frame': 5765 obs. of 2 variables:
$ time1: chr "30-01-2020 07:20" "25-04-2019 15:05" "11-01-2019 22:01" "11-01-2019 22:01" ...
$ time2: chr "14-02-2020 15:34" "27-04-2019 10:56" "12-01-2019 00:42" "23-01-2019 10:08" ...
Expected output:
> head(u)
             time1            time2                           new
1 30-01-2020 07:20 14-02-2020 15:34 hours between time1 and time2
2 25-04-2019 15:05 27-04-2019 10:56 hours between time1 and time2
3 11-01-2019 22:01 12-01-2019 00:42 hours between time1 and time2
4 11-01-2019 22:01 23-01-2019 10:08 hours between time1 and time2
I would like the hours to have one decimal place, and I would prefer a dplyr or lubridate solution.
u <- structure(list(time1 = c("30-01-2020 07:20", "25-04-2019 15:05",
"11-01-2019 22:01", "11-01-2019 22:01", "17-04-2018 07:55"),
time2 = c("14-02-2020 15:34", "27-04-2019 10:56", "12-01-2019 00:42",
"23-01-2019 10:08", "20-04-2018 11:04")), row.names = c(NA,
5L), class = "data.frame")
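After converting to POSIXct, we can use difftime. Note that difftime(x, y) computes x - y, so with the arguments in this order the result is negative whenever time2 is later than time1; swap them to get the elapsed hours from time1 to time2.
with(u, difftime(as.POSIXct(time1, format = '%d-%m-%Y %H:%M'),
as.POSIXct(time2, format = '%d-%m-%Y %H:%M'), units = 'hours'))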
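To store the result as a column rather than just print it, something like the following base R sketch might work. It assumes the timestamps can be treated as UTC (so no daylight-saving jumps affect the difference), that a positive "time1 to time2" duration is what is wanted, and it borrows the column name new from the expected output in the question; adjust tz if the data were recorded in a local time zone.

```r
# Example data from the question (first five rows)
u <- data.frame(
  time1 = c("30-01-2020 07:20", "25-04-2019 15:05", "11-01-2019 22:01",
            "11-01-2019 22:01", "17-04-2018 07:55"),
  time2 = c("14-02-2020 15:34", "27-04-2019 10:56", "12-01-2019 00:42",
            "23-01-2019 10:08", "20-04-2018 11:04"),
  stringsAsFactors = FALSE
)

fmt <- "%d-%m-%Y %H:%M"

# Assumption: treat both columns as UTC so the result does not depend
# on the time zone of the machine running the code
t1 <- as.POSIXct(u$time1, format = fmt, tz = "UTC")
t2 <- as.POSIXct(u$time2, format = fmt, tz = "UTC")

# difftime(t2, t1) is t2 - t1, i.e. positive when time2 is the later time;
# as.numeric() drops the difftime class and leaves plain hours
u$new <- as.numeric(difftime(t2, t1, units = "hours"))
```

The character columns are left untouched, so the original strings remain available alongside the numeric hours.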
Or, as @r2evans mentioned in the comments, a lubridate option is
library(lubridate)
as.numeric(dmy_hm(u$time1) - dmy_hm(u$time2), units = "hours")
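Since the question asks for a dplyr or lubridate solution with one decimal place, the pieces above could be combined roughly as follows. This is only a sketch: the column name new is taken from the expected output, the argument order is chosen so that the hours come out positive when time2 is later, and dmy_hm() is left at its default time zone (UTC).

```r
library(dplyr)
library(lubridate)

# Example data from the question (first five rows)
u <- data.frame(
  time1 = c("30-01-2020 07:20", "25-04-2019 15:05", "11-01-2019 22:01",
            "11-01-2019 22:01", "17-04-2018 07:55"),
  time2 = c("14-02-2020 15:34", "27-04-2019 10:56", "12-01-2019 00:42",
            "23-01-2019 10:08", "20-04-2018 11:04"),
  stringsAsFactors = FALSE
)

u <- u %>%
  mutate(
    # parse both strings, take time2 - time1 in hours, keep one decimal
    new = round(
      as.numeric(difftime(dmy_hm(time2), dmy_hm(time1), units = "hours")),
      1
    )
  )
```

Rounding is done last, on the plain numeric value, so the stored column is an ordinary double rather than a difftime object.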
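Here is a fast solution using data.table. Note that it converts u by reference and overwrites time1 and time2 with POSIXct values.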
library(data.table)
library(lubridate)
setDT(u)
u[, time1 := dmy_hm(time1)]
u[, time2 := dmy_hm(time2)]
u[, diff := difftime(time1, time2, units = "hours")]
u
> time1 time2 diff
> 1: 2020-01-30 07:20:00 2020-02-14 15:34:00 -368.233333 hours
> 2: 2019-04-25 15:05:00 2019-04-27 10:56:00 -43.850000 hours
> 3: 2019-01-11 22:01:00 2019-01-12 00:42:00 -2.683333 hours
> 4: 2019-01-11 22:01:00 2019-01-23 10:08:00 -276.116667 hours
> 5: 2018-04-17 07:55:00 2018-04-20 11:04:00 -75.150000 hours
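If the original character columns need to be kept, a variant along these lines may be preferable. It is a sketch under a few assumptions: it works on a copy made with as.data.table() instead of calling setDT() on u, it names the result new (from the question's expected output), and it orders the arguments so the hours are positive when time2 is later.

```r
library(data.table)
library(lubridate)

# Example data from the question (first five rows)
u <- data.frame(
  time1 = c("30-01-2020 07:20", "25-04-2019 15:05", "11-01-2019 22:01",
            "11-01-2019 22:01", "17-04-2018 07:55"),
  time2 = c("14-02-2020 15:34", "27-04-2019 10:56", "12-01-2019 00:42",
            "23-01-2019 10:08", "20-04-2018 11:04"),
  stringsAsFactors = FALSE
)

# as.data.table() copies, so u itself stays an ordinary data.frame
dt <- as.data.table(u)

# Parse inside the expression; time1 and time2 remain character columns
dt[, new := as.numeric(difftime(dmy_hm(time2), dmy_hm(time1), units = "hours"))]
```

Storing a plain numeric column also avoids the "hours" suffix that a difftime column prints.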