How to obtain all gaps as start .. stop intervals in a pandas datetime index
I want to find all gaps in a pandas DatetimeIndex as a list of intervals. For example:
'2022-05-06 00:01:00'
'2022-05-06 00:02:00' <- Start of gap
'2022-05-06 00:06:00' <- End of gap
'2022-05-06 00:07:00'
'2022-05-06 00:08:00'
'2022-05-06 00:09:00' <- Next gap start
'2022-05-06 05:00:00' <- End
'2022-05-06 05:01:00'
I want the following:
[('2022-05-06 00:03:00', '2022-05-06 00:05:00'),
('2022-05-06 00:10:00', '2022-05-06 04:59:00')]
The frequency can be arbitrary, but it is the same across the whole index.
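IIUC, you can compute the difference between consecutive timestamps to identify the gaps. Use the resulting mask to slice out the starts and stops, then zip them into a list.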
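import pandas as pd

# ensure the column holds datetimes
df['datetime'] = pd.to_datetime(df['datetime'])
# expected spacing between consecutive rows (1 minute in this example)
t = pd.Timedelta('1min')
# True on each row that directly follows a gap
mask = df['datetime'].diff().gt(t)
# first missing timestamp: the row before the gap, plus one step
starts = df.loc[mask.shift(-1, fill_value=False), 'datetime'].add(t).astype(str)
# last missing timestamp: the row after the gap, minus one step
stops = df.loc[mask, 'datetime'].sub(t).astype(str)
# pair each start with its stop
out = list(zip(starts, stops))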
Output:
[('2022-05-06 00:03:00', '2022-05-06 00:05:00'),
('2022-05-06 00:10:00', '2022-05-06 04:59:00')]
Input used:
datetime
0 2022-05-06 00:01:00
1 2022-05-06 00:02:00
2 2022-05-06 00:06:00
3 2022-05-06 00:07:00
4 2022-05-06 00:08:00
5 2022-05-06 00:09:00
6 2022-05-06 05:00:00
7 2022-05-06 05:01:00
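Since the question mentions a DatetimeIndex with an arbitrary frequency, here is a minimal sketch of the same diff-and-mask idea applied to an index directly. The helper name `find_gaps` and the `step` parameter are hypothetical, not part of pandas. Inferring `step` as the smallest difference between consecutive timestamps is an assumption: it only holds if the index is sorted, has no duplicates, and contains at least one pair of adjacent timestamps with no gap between them.

```python
import pandas as pd


def find_gaps(index: pd.DatetimeIndex, step: pd.Timedelta) -> list:
    """Hypothetical helper: (first_missing, last_missing) pairs for a sorted index."""
    s = pd.Series(index)
    # True on each position that directly follows a gap
    after_gap = s.diff().gt(step)
    # Timestamp before each gap, plus one step = first missing timestamp
    starts = s[after_gap.shift(-1, fill_value=False)].add(step).astype(str)
    # Timestamp after each gap, minus one step = last missing timestamp
    stops = s[after_gap].sub(step).astype(str)
    return list(zip(starts, stops))


idx = pd.DatetimeIndex([
    '2022-05-06 00:01:00', '2022-05-06 00:02:00', '2022-05-06 00:06:00',
    '2022-05-06 00:07:00', '2022-05-06 00:08:00', '2022-05-06 00:09:00',
    '2022-05-06 05:00:00', '2022-05-06 05:01:00',
])

# Assumption: the smallest spacing observed is the nominal frequency
step = pd.Series(idx).diff().min()
gaps = find_gaps(idx, step)
print(gaps)
```

If the frequency is known in advance, passing it explicitly (for example `pd.Timedelta('1min')`) is safer than inferring it from the data.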