Plot frequency of events with time in R

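I have quite a lot of time data that I would like to put in a frequency plot, where the X axis is a series of time intervals and the Y axis is the number of data points collected in each interval. See this example:

Suppose I have this list: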

[10:17:55, 10:37:40, 10:40:26, 10:48:18, 11:00:17, 11:01:12, 11:06:58, 11:09:20, 11:43:41, 11:48:24, 11:49:14, 12:07:31, 12:10:52, 12:10:52, 12:19:00, 12:19:00, 12:19:43, 12:20:55, 12:38:27, 12:38:27, 12:55:09, 12:55:10, 12:57:31, 12:57:31, 13:04:16, 13:04:16, 13:06:51, 13:06:51, 14:55:06, 14:56:10, 15:01:30, 15:28:42, 15:29:17, 15:35:33, 15:58:32, 16:05:07, 16:09:16, 16:10:36, 16:32:57, 16:32:57, 16:34:32, 16:38:16, 17:43:27, 17:53:01, 17:56:14, 18:08:21, 18:17:23, 18:37:23, 18:37:23, 18:43:13, 18:43:13, 18:51:43, 18:51:43, 19:05:39, 19:05:39]

I would like to plot a histogram showing how many values fall within each 1-hour or 30-minute interval (still deciding), for example:

10h - 11h: 4
11h - 12h: 7
.
.
.
19h - 20h: 2

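But with all of this represented in a chart. I know the basics of plotting histograms in R, but I don't know how to do this. I have seen some answers that plot by whole day, which doesn't really apply here because these values were collected on different days... Can you help me?

Edit: here is the dput() of the list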

structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 
13L, 13L, 14L, 14L, 15L, 16L, 17L, 17L, 18L, 19L, 20L, 20L, 21L, 
21L, 22L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 
33L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 41L, 42L, 42L, 
43L, 43L, 44L, 44L), .Label = c("10:17:55", "10:37:40", "10:40:26", 
"10:48:18", "11:00:17", "11:01:12", "11:06:58", "11:09:20", "11:43:41", 
"11:48:24", "11:49:14", "12:07:31", "12:10:52", "12:19:00", "12:19:43", 
"12:20:55", "12:38:27", "12:55:09", "12:55:10", "12:57:31", "13:04:16", 
"13:06:51", "14:55:06", "14:56:10", "15:01:30", "15:28:42", "15:29:17", 
"15:35:33", "15:58:32", "16:05:07", "16:09:16", "16:10:36", "16:32:57", 
"16:34:32", "16:38:16", "17:43:27", "17:53:01", "17:56:14", "18:08:21", 
"18:17:23", "18:37:23", "18:43:13", "18:51:43", "19:05:39"), class = "factor")

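Here is what I used to get what you want.

This works for both hours and half hours. It's not the prettiest, but I think it serves your purpose. You will need to massage the axes a bit so that they show the information you want. Hope it helps!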

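# `times` is the factor from the dput() above; strptime() converts it to character
parsed <- strptime(times, format = "%H:%M:%S")
hours <- as.numeric(format(parsed, "%H"))
# right = FALSE makes each bin [h, h + 1), so 10:xx and 11:xx are counted separately
hist(hours, breaks = seq(min(hours), max(hours) + 1), right = FALSE)

# Fractional hours (hour + minutes / 60), binned every half hour
half_hours <- hours + as.numeric(format(parsed, "%M")) / 60
hist(half_hours, breaks = seq(min(hours), max(hours) + 1, by = 0.5), right = FALSE)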
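
If you want to check the bars against the counts listed in the question, a plain tabulation may be easier to read than a histogram. Below is a minimal sketch, assuming the times are available as a character vector (here retyped from the question's dput()); the names `hrs` and `counts` and the label format are just illustrative choices.

```r
# Times retyped from the question's dput() (55 values, duplicates included)
times <- c(
  "10:17:55", "10:37:40", "10:40:26", "10:48:18", "11:00:17", "11:01:12", "11:06:58",
  "11:09:20", "11:43:41", "11:48:24", "11:49:14", "12:07:31", "12:10:52", "12:10:52",
  "12:19:00", "12:19:00", "12:19:43", "12:20:55", "12:38:27", "12:38:27", "12:55:09",
  "12:55:10", "12:57:31", "12:57:31", "13:04:16", "13:04:16", "13:06:51", "13:06:51",
  "14:55:06", "14:56:10", "15:01:30", "15:28:42", "15:29:17", "15:35:33", "15:58:32",
  "16:05:07", "16:09:16", "16:10:36", "16:32:57", "16:32:57", "16:34:32", "16:38:16",
  "17:43:27", "17:53:01", "17:56:14", "18:08:21", "18:17:23", "18:37:23", "18:37:23",
  "18:43:13", "18:43:13", "18:51:43", "18:51:43", "19:05:39", "19:05:39"
)

# The hour is the first two characters of each "HH:MM:SS" string
hours <- as.integer(substr(times, 1, 2))
hrs <- min(hours):max(hours)

# Explicit factor levels keep hours without any events as zero counts
counts <- table(factor(hours, levels = hrs,
                       labels = sprintf("%dh - %dh", hrs, hrs + 1L)))
print(counts)

# One bar per hour; las = 2 turns the labels so they do not overlap
barplot(counts, las = 2, ylab = "Number of events")
```

For this data the first two bars come out as 4 and 7 and the last one as 2, matching the example in the question.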

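POSIXt and Date objects have range, trunc, and seq methods. Assuming you have assigned that structure object to a name such as tms, the following converts it to POSIXct, builds a range truncated to whole hours, and then constructs a sequence of breaks spanning those hours so the data is binned in 30-minute intervals: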

> tms <- as.POSIXct(tms, format="%H:%M:%S")
> brks <- trunc(range(tms), "hours")
Warning message:
In if (isdst == -1) { :
  the condition has length > 1 and only the first element will be used
> hist(tms, breaks=seq(brks[1], brks[2]+3600, by="30 min") )

Note that the plotting method for POSIXt objects takes care of the x-axis labels.
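
To look at the numbers behind the half-hour bins, the same kind of break sequence can be passed to cut() and tabulated. This is only a sketch under some assumptions: it uses a handful of the times from the question rather than the full list, and it pins every time to an arbitrary date (2000-01-01) in UTC so that the calendar day and daylight saving time cannot affect the bins. The name `half_hour_counts` is made up for the example.

```r
# A few of the times from the question; date and time zone are arbitrary choices
times <- c("10:17:55", "10:37:40", "10:40:26", "10:48:18", "11:00:17",
           "11:01:12", "11:43:41", "19:05:39", "19:05:39")
tms <- as.POSIXct(paste("2000-01-01", times), tz = "UTC")

# Half-hour breaks from the first whole hour up to the hour after the last time
start <- as.POSIXct(trunc(min(tms), "hours"))
end <- as.POSIXct(trunc(max(tms), "hours")) + 3600
brks <- seq(start, end, by = "30 min")

# Label each bin by its start time; right = FALSE gives bins [start, start + 30 min)
bins <- cut(tms, breaks = brks,
            labels = format(brks[-length(brks)], "%H:%M", tz = "UTC"),
            right = FALSE)
half_hour_counts <- table(bins)

# Show only the bins that contain at least one event
print(half_hour_counts[half_hour_counts > 0])
```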

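For 30-minute bins, I suppose you could check whether the latest time falls within half an hour of the second element of brks. So this would be the code that avoids an empty last bin when the goal is half-hour bins: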

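# Minutes between the start of the last hour and the latest time; the units are
# set explicitly because difftime otherwise picks them automatically
last_gap <- as.numeric(difftime(max(tms), brks[2], units = "mins"))
hist(tms, breaks = seq(brks[1],
                       brks[2] + if (last_gap < 30) 1800 else 3600,
                       by = "30 min"))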
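
The same idea can be written without the if/else by rounding on the numeric (seconds) scale. This is a sketch, not part of the answer above: `half_hour_breaks` is a hypothetical helper, and it assumes the times are pinned to UTC, where half-hour clock boundaries are multiples of 1800 seconds since the epoch. Unlike the code above, it also starts at the half hour just before the earliest time, so an empty first bin is avoided too.

```r
# A few of the times from the question; date and time zone are arbitrary choices
times <- c("10:17:55", "10:37:40", "11:00:17", "12:38:27", "18:51:43", "19:05:39")
tms <- as.POSIXct(paste("2000-01-01", times), tz = "UTC")

# Hypothetical helper: breaks of `width` seconds that just enclose the data
half_hour_breaks <- function(x, width = 1800) {
  secs <- as.numeric(x)
  lo <- floor(min(secs) / width) * width        # boundary at or before the earliest time
  hi <- (floor(max(secs) / width) + 1) * width  # first boundary after the latest time
  as.POSIXct(seq(lo, hi, by = width), origin = "1970-01-01", tz = "UTC")
}

brks <- half_hour_breaks(tms)
print(format(range(brks), "%H:%M", tz = "UTC"))

# freq = TRUE asks hist() for counts; the date-time method defaults to densities
hist(tms, breaks = brks, freq = TRUE)
```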