How to calculate hours between two character columns with the format "dd-mm-yyyy hh:mm"

I have u$time1 and u$time2, both in the format dd-mm-yyyy hh:mm. I want to create a new covariate containing the hours between u$time1 and u$time2.

Both columns are stored as character:

str(u)
'data.frame':   5765 obs. of  2 variables:
 $ time1: chr  "30-01-2020 07:20" "25-04-2019 15:05" "11-01-2019 22:01" "11-01-2019 22:01" ...
 $ time2: chr  "14-02-2020 15:34" "27-04-2019 10:56" "12-01-2019 00:42" "23-01-2019 10:08" ...

Expected output:

>  head(u)
             time1            time2                            new
1 30-01-2020 07:20 14-02-2020 15:34  hours between time1 and time2
2 25-04-2019 15:05 27-04-2019 10:56  hours between time1 and time2
3 11-01-2019 22:01 12-01-2019 00:42  hours between time1 and time2
4 11-01-2019 22:01 23-01-2019 10:08  hours between time1 and time2

I would prefer the hours with one decimal place, and a dplyr/lubridate solution is preferred.

u <- structure(list(time1 = c("30-01-2020 07:20", "25-04-2019 15:05", 
"11-01-2019 22:01", "11-01-2019 22:01", "17-04-2018 07:55"), 
    time2 = c("14-02-2020 15:34", "27-04-2019 10:56", "12-01-2019 00:42", 
    "23-01-2019 10:08", "20-04-2018 11:04")), row.names = c(NA, 
5L), class = "data.frame")

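After converting to POSIXct, we can use difftime. Note that difftime(a, b) returns a - b, so with time1 as the first argument the result is negative whenever time2 is the later timestamp; swap the arguments to get positive hours.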
with(u, difftime(as.POSIXct(time1, format = '%d-%m-%Y %H:%M'), 
       as.POSIXct(time2, format = '%d-%m-%Y %H:%M'), units = 'hour'))
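To store the result as a plain numeric column with one decimal, as the question asks, the base R approach might be extended as in the sketch below. It assumes the timestamps can be treated as UTC (tz = "UTC" is my assumption, not something stated in the question); without an explicit time zone, as.POSIXct uses the local one, and a daylight-saving change between the two timestamps would shift the result by an hour.

```r
# Sample data copied from the question
u <- data.frame(
  time1 = c("30-01-2020 07:20", "25-04-2019 15:05", "11-01-2019 22:01",
            "11-01-2019 22:01", "17-04-2018 07:55"),
  time2 = c("14-02-2020 15:34", "27-04-2019 10:56", "12-01-2019 00:42",
            "23-01-2019 10:08", "20-04-2018 11:04")
)

fmt <- "%d-%m-%Y %H:%M"

# Parse both columns; tz = "UTC" is an assumption to avoid DST effects
t1 <- as.POSIXct(u$time1, format = fmt, tz = "UTC")
t2 <- as.POSIXct(u$time2, format = fmt, tz = "UTC")

# time2 first so that later end times give positive hours;
# as.numeric() drops the difftime class, round() keeps one decimal
u$new <- round(as.numeric(difftime(t2, t1, units = "hours")), 1)

u
```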

Or, as @r2evans mentioned in the comments, a lubridate option is

library(lubridate)
as.numeric(dmy_hm(u$time1) - dmy_hm(u$time2), units = "hours")
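Since the question asks for a dplyr/lubridate solution with one decimal, the same idea could be written inside mutate(). This is a minimal sketch rather than part of the original answer; the column name new comes from the expected output in the question, and time2 is placed first so the hours come out positive. dmy_hm() parses as UTC by default.

```r
library(dplyr)
library(lubridate)

# Sample data copied from the question
u <- data.frame(
  time1 = c("30-01-2020 07:20", "25-04-2019 15:05", "11-01-2019 22:01",
            "11-01-2019 22:01", "17-04-2018 07:55"),
  time2 = c("14-02-2020 15:34", "27-04-2019 10:56", "12-01-2019 00:42",
            "23-01-2019 10:08", "20-04-2018 11:04")
)

res <- u %>%
  mutate(
    # Parse both strings, take time2 - time1 in hours, round to one decimal
    new = round(
      as.numeric(difftime(dmy_hm(time2), dmy_hm(time1), units = "hours")),
      1
    )
  )

res
```

The original character columns are left untouched here; only the numeric column new is added.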

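Here is a very fast solution using data.table. It converts both columns in place and, like the code above, computes time1 - time2, so the differences are negative: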

library(data.table)
library(lubridate)
setDT(u)
u[, time1 := dmy_hm(time1)]
u[, time2 := dmy_hm(time2)]

u[, diff := difftime(time1, time2, units = "hours")]

u
>                  time1               time2              diff
> 1: 2020-01-30 07:20:00 2020-02-14 15:34:00 -368.233333 hours
> 2: 2019-04-25 15:05:00 2019-04-27 10:56:00  -43.850000 hours
> 3: 2019-01-11 22:01:00 2019-01-12 00:42:00   -2.683333 hours
> 4: 2019-01-11 22:01:00 2019-01-23 10:08:00 -276.116667 hours
> 5: 2018-04-17 07:55:00 2018-04-20 11:04:00  -75.150000 hours