How to plot multiple daily time series, aligned at specified trigger times?

Question:

I have a dataframe df that looks like this:

                                  value  msg_type
date        
2022-03-15 08:15:10+00:00         122    None
2022-03-15 08:25:10+00:00         125    None
2022-03-15 08:30:10+00:00         126    None
2022-03-15 08:30:26.542134+00:00  127    ANNOUNCEMENT
2022-03-15 08:35:10+00:00         128    None
2022-03-15 08:40:10+00:00         122    None
2022-03-15 08:45:09+00:00         127    None
2022-03-15 08:50:09+00:00         133    None
2022-03-15 08:55:09+00:00         134    None
....
2022-03-16 09:30:09+00:00         132    None
2022-03-16 09:30:13.234425+00:00  135    ANNOUNCEMENT
2022-03-16 09:35:09+00:00         130    None
2022-03-16 09:40:09+00:00         134    None
2022-03-16 09:45:09+00:00         135    None
2022-03-16 09:50:09+00:00         134    None

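The value data arrives roughly every 5 minutes, but messages can arrive at any time. I am trying to plot one line of values per day, where the x-axis ranges from t=-2 hours to t=+8 hours and the ANNOUNCEMENT occurs at t=0 (see the figure below).

So, for example, if an ANNOUNCEMENT occurs at 8:30AM on March 15 and again at 9:30AM on March 16, there should be two lines: one for March 15 covering 6:30AM to 4:30PM, and one for March 16 covering 7:30AM to 5:30PM.

Both share the same x-axis, ranging from -2 to +8, with the ANNOUNCEMENT at t=0.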


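What I've tried:

I can currently do this by finding the index position of the announcement (e.g. it occurs at row 298 -> announcement_index = 298), generating an array of 120 numbers from -24 to 96 (each number representing 5 minutes, for 10 hours in total -> x = np.arange(-24, 96, 1)), and then plotting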

sns.lineplot(x=x, y=df['value'].iloc[announcement_index-24:announcement_index+96])

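While this mostly works (see the figure below), I suspect it is not the right approach. Specifically, adding more information to the plot at specific times (such as a different set of 'value' markers) is difficult, because I would need to convert the timestamps into this arbitrary -24 to 96 range.

How can I make the same plot using the datetime index? Many thanks!

Answer:

Assuming the index has already been converted with to_datetime, create an IntervalArray spanning -2H to +8H around each index value: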

dl, dr = -2, 8
left = df.index + pd.Timedelta(hours=dl)    # window start for every row
right = df.index + pd.Timedelta(hours=dr)   # window end for every row

df['interval'] = pd.arrays.IntervalArray.from_arrays(left, right)
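To make the interval column more concrete, here is a minimal sketch that builds it for three rows copied from the sample data and reads back the window attached to the announcement row. The names demo, window_start, window_end and window_length are only for illustration and are not part of the original answer.

```python
import pandas as pd

# three rows copied from the sample data in the question
idx = pd.DatetimeIndex(
    [pd.Timestamp('2022-03-15 08:25:10+00:00'),
     pd.Timestamp('2022-03-15 08:30:26.542134+00:00'),
     pd.Timestamp('2022-03-15 08:35:10+00:00')],
    name='date',
)
demo = pd.DataFrame({'value': [125, 127, 128],
                     'msg_type': [None, 'ANNOUNCEMENT', None]}, index=idx)

# same construction as above: one interval per row, -2H to +8H around its timestamp
dl, dr = -2, 8
left = demo.index + pd.Timedelta(hours=dl)
right = demo.index + pd.Timedelta(hours=dr)
demo['interval'] = pd.arrays.IntervalArray.from_arrays(left, right)

# pick the announcement row and inspect its window
ann = demo.loc[demo['msg_type'] == 'ANNOUNCEMENT'].iloc[0]
window_start = ann['interval'].left      # announcement time minus 2 hours
window_end = ann['interval'].right       # announcement time plus 8 hours
window_length = ann['interval'].length   # total width of the window
```

Every row gets an interval, but only the ones on ANNOUNCEMENT rows are used in the plotting step.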

Then for each ANNOUNCEMENT, plot the window from interval.left to interval.right:

  • Set the x-axis to seconds since the ANNOUNCEMENT
  • Set the tick labels to hours since the ANNOUNCEMENT

fig, ax = plt.subplots()
for ann in df.loc[df['msg_type'] == 'ANNOUNCEMENT'].itertuples():
    window = df.loc[ann.interval.left:ann.interval.right] # extract interval.left to interval.right
    window.index -= ann.Index                             # compute time since announcement
    window.index = window.index.total_seconds()           # convert to seconds since announcement

    window.plot(ax=ax, y='value', label=ann.Index.date())
    deltas = np.arange(dl, dr + 1)
    ax.set(xticks=deltas * 3600, xticklabels=deltas)      # set tick labels to hours since announcement

ax.legend()

Here is the output with a smaller window of -1H to +2H, so that the small sample data can be seen more clearly (full code below):

Full code:

import io
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

s = '''
date,value,msg_type
2022-03-15 08:15:10+00:00,122,None
2022-03-15 08:25:10+00:00,125,None
2022-03-15 08:30:10+00:00,126,None
2022-03-15 08:30:26.542134+00:00,127,ANNOUNCEMENT
2022-03-15 08:35:10+00:00,128,None
2022-03-15 08:40:10+00:00,122,None
2022-03-15 08:45:09+00:00,127,None
2022-03-15 08:50:09+00:00,133,None
2022-03-15 08:55:09+00:00,134,None
2022-03-16 09:30:09+00:00,132,None
2022-03-16 09:30:13.234425+00:00,135,ANNOUNCEMENT
2022-03-16 09:35:09+00:00,130,None
2022-03-16 09:40:09+00:00,134,None
2022-03-16 09:45:09+00:00,135,None
2022-03-16 09:50:09+00:00,134,None
'''
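df = pd.read_csv(io.StringIO(s), index_col=0)
# parse each timestamp on its own: the rows mix second and microsecond precision,
# which a single inferred datetime format may not cover
df.index = pd.DatetimeIndex([pd.Timestamp(t) for t in df.index], name='date')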

# create intervals from -1H to +2H of the index
dl, dr = -1, 2
left = df.index + pd.Timedelta(hours=dl)
right = df.index + pd.Timedelta(hours=dr)
df['interval'] = pd.arrays.IntervalArray.from_arrays(left, right)

# plot each announcement's interval.left to interval.right
fig, ax = plt.subplots()
for ann in df.loc[df['msg_type'] == 'ANNOUNCEMENT'].itertuples():
    window = df.loc[ann.interval.left:ann.interval.right] # extract interval.left to interval.right
    window.index -= ann.Index                             # compute time since announcement
    window.index = window.index.total_seconds()           # convert to seconds since announcement

    window.plot(ax=ax, y='value', label=ann.Index.date())
    deltas = np.arange(dl, dr + 1)
    ax.set(xticks=deltas * 3600, xticklabels=deltas)      # set tick labels to hours since announcement

ax.grid()
ax.legend()
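
Since the x-axis is now derived from real timestamps, adding extra information at a specific time (the difficulty mentioned in the question) only needs the same subtraction. A minimal sketch, assuming a hypothetical helper seconds_since that is not part of the original answer, applied to two readings from the sample data:

```python
import pandas as pd

def seconds_since(timestamps, announcement):
    """Hypothetical helper: seconds from the announcement to each timestamp."""
    return (pd.DatetimeIndex(timestamps) - announcement).total_seconds()

# announcement time of the first day in the sample data
announcement = pd.Timestamp('2022-03-15 08:30:26.542134+00:00')

# two readings from the sample data that we might want to highlight
marks = [pd.Timestamp('2022-03-15 08:15:10+00:00'),
         pd.Timestamp('2022-03-15 08:50:09+00:00')]

x_marks = seconds_since(marks, announcement)   # same units as the x-axis in the loop
hours = x_marks / 3600                         # same units as the tick labels
```

The resulting x_marks can then be passed to something like ax.scatter or ax.axvline on the same axes, so the markers line up with the curves without any manual mapping to row positions.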