How to discretize hour-times?

I have some time values that I am trying to discretize into 3 categories: morning (4.30, 12.00), evening (12.00, 21.00), and night (21.00, 4.30).

First, I tried using lubridate to convert the character vector to a Period object:
library(lubridate)
h <- hm(c("14:30", "02:10", "06:30", "14:50", "20:30", "21:00", "12:00", "23:30", "08:10", "00:00"))

Now I need to discretize h.

I would usually use cut, but it doesn't seem to work here:

cut(h, breaks = hm(c('4.30', '12.00', '21.00')), levels = c('morning', 'evening', 'night'))

Does lubridate have a specific function for this?

We can convert the values to a times object from the chron package

library(chron)
t1 <- times(paste0(v1, ":00"))

and then run cut with the breaks specified as times:
cut(t1, breaks = times(c('04:30:00', '12:00:00', 
            '21:00:00', '21:00:01')), labels = c('morning', 'evening', 'night'))
#[1] evening <NA>    morning evening evening evening morning <NA>    morning <NA>   
#Levels: morning evening night

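Or this can be done with strptime from base R. Times that fall outside the breaks come back as NA, so they are set to "night" afterwards: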
res <- cut(strptime(v1, format = "%H:%M"), breaks = strptime(c("04:30", "12:00", 
   "21:00", "21:01"), format = "%H:%M"), 
   labels = c("morning", "evening", "night"))
res[is.na(res)] <- "night"
res
#[1] evening night   morning evening evening night   evening night   morning night  
#Levels: morning evening night

Data

v1 <- c("14:30", "02:10", "06:30", "14:50", "20:30",
            "21:00", "12:00", "23:30", "08:10", "00:00")    
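
As a variant of the same idea, here is a minimal base R sketch that avoids the NA patch-up by covering the whole day with breaks. It assumes the inputs are well-formed "HH:MM" strings and that each interval is closed on the left (so 12:00 counts as evening and 21:00 as night, matching the strptime result above). The names hm_parts, mins and bins, and the temporary labels night_early and night_late, are made up for illustration.

```r
v1 <- c("14:30", "02:10", "06:30", "14:50", "20:30",
        "21:00", "12:00", "23:30", "08:10", "00:00")

# Minutes since midnight from "HH:MM" strings
hm_parts <- do.call(rbind, strsplit(v1, ":", fixed = TRUE))
mins <- as.integer(hm_parts[, 1]) * 60L + as.integer(hm_parts[, 2])

# Left-closed bins covering the whole day: 04:30 = 270, 12:00 = 720, 21:00 = 1260
bins <- cut(mins, breaks = c(0, 270, 720, 1260, 1440), right = FALSE,
            labels = c("night_early", "morning", "evening", "night_late"))

# Assigning a repeated level name merges the two night bins into one level
levels(bins) <- c("night", "morning", "evening", "night")
bins
```

Because night wraps around midnight, it has to be split into two bins first and merged afterwards; cut itself has no notion of a circular range.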

You can use findInterval from base R:
# Break at 4.30 to match the morning boundary given in the question
breaks <- strptime(c("0.00", "4.30", "12.00", "21.00", "23.59"), "%H.%M")
labels <- c("night", "morning", "evening", "night")
labels[findInterval(strptime(dat, "%H:%M"), breaks)]
 [1] "evening" "night"   "morning" "evening" "evening" "night"   "evening"
 [8] "night"   "morning" "night"  

where

 dat <- c("14:30", "02:10", "06:30", "14:50", "20:30",
        "21:00", "12:00", "23:30", "08:10", "00:00")    

We can see that dat[2] ("02:10") has been assigned night.
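
One edge case may be worth guarding against: with an upper break at 23.59, a value of exactly "23:59" lands in a fifth interval that has no label and comes back as NA. The sketch below is one possible adjustment, not part of the original answer. It drops the upper break, uses the question's 4.30 boundary, and converts everything to plain numbers first. The helper to_num is a hypothetical name, and tz = "UTC" is an assumption made so that daylight-saving shifts cannot affect the comparison.

```r
dat <- c("14:30", "02:10", "06:30", "14:50", "20:30",
         "21:00", "12:00", "23:30", "08:10", "00:00", "23:59")

# Hypothetical helper: parse a clock time and return seconds since the epoch.
# strptime fills in the current date, which is the same for data and breaks.
to_num <- function(x, fmt) as.numeric(as.POSIXct(strptime(x, fmt, tz = "UTC")))

# No upper break: anything from 21:00 onwards falls in the last interval
breaks <- to_num(c("0.00", "4.30", "12.00", "21.00"), "%H.%M")
labels <- c("night", "morning", "evening", "night")

out <- labels[findInterval(to_num(dat, "%H:%M"), breaks)]
out
```

Since findInterval returns an index between 1 and 4 here for any valid clock time, every value gets a label.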

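Another approach is to convert the times to numbers, after which you can use the discretize function from arules. The same approach can be applied flexibly to dates and similar data.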

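library(arules)
h <- data.frame(V1 = c("14:30", "02:10", "06:30", "14:50", "20:30", "21:00", "12:00", "23:30", "08:10", "00:00"))
# Drop the colon and convert to a number, e.g. "14:30" becomes 1430
h$V2 <- as.numeric(gsub(":", "", h$V1))
# Breaks must be increasing; the four intervals are night, morning, evening, night.
# Older arules versions call this argument `categories` instead of `breaks`.
h$discrete <- discretize(h$V2, method = "fixed", breaks = c(0, 430, 1200, 2100, Inf))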