matplotlib - removing time gaps in time-series data plots?

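I have a set of data that pairs time values with another value (e.g. elevation). I am currently plotting it in matplotlib, and it looks like this:

As you can see, there are large "gaps" in the data with connecting lines drawn across them, because the plotted times are not close together.

My code looks something like this (I don't have pandas):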

time_sorted_list = sorted(
    unsorted_value_list, key=lambda x: x.time
)

elev = [i.elev for i in time_sorted_list]

time = [i.time for i in time_sorted_list]

...(creating elev_plot figure)

elev_plot.plot(time, elev)

elev_plot.grid()
plot.show()

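How can I set up matplotlib to eliminate these large time gaps so that all the values sit close to each other? I don't want to remove the time entirely; knowing when each elevation value occurred is important.

Sample data:

Elevation: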

[7.061637017210896, 8.62634035986128, 9.449231409579046, 9.449245213599722, 11.183401391828983, 11.183478912151985, 12.097695062804538, 14.032121063226736, 19.53103255309029, 20.132430448781705, 22.61562154333468, 23.892538058003574, 25.174568988146742, 25.81347252259264, 27.07766665010065, 28.301824218809962, 29.4560748154805, 30.51425894250495, 31.44003996067941, 32.19935454662037, 32.75797351858856, 33.09230892539046, 33.185638377860386, 32.64289077682021, 32.64282073446187, 32.03439364065985, 32.03432718356379, 31.235743890788736, 30.278072995085186, 29.198208966807904, 28.02762496428912, 25.534718034319297, 24.259335234095236, 22.987561974637945, 21.733969026630948, 20.50551578656278, 19.30698140187512, 18.145822000390414, 17.021157410685678, 14.89032761900031, 13.881534452146786, 12.910228441720443, 11.9735858799619, 10.20078824064575, 8.548230021876677, 7.7622314951825935, 7.002108526108933, 6.2652418436101245, 5.550265750342097, 4.180538242033181, 3.523953356314147, 10.468976986513358, 10.826799614265274, 11.548804997129018, 15.198784031309774, 15.913277577899912, 16.609161706884624, 18.52422058507705, 19.077032064883326, 19.57286148977654, 20.002244208317894, 20.91143576667658, 20.91127829131031, 20.911272488234292, 20.817089892791472, 20.630717861747748, 20.630698531303153, 20.35705893695184, 20.357030796826844, 20.001375885553323, 19.571700419702164, 19.075697249423904, 17.921110661414083, 15.911217262744842, 15.197579232135709, 14.472858366158526, 13.740784521720999, 13.007405649336956]

Time (stored as datetime):

['08/26/2021, 08:28:28', '08/26/2021, 08:28:48', '08/26/2021, 08:28:58', '08/26/2021, 08:28:58', '08/26/2021, 08:29:18', '08/26/2021, 08:29:18', '08/26/2021, 08:29:28', '08/26/2021, 08:29:48', '08/26/2021, 08:30:38', '08/26/2021, 08:30:43', '08/26/2021, 08:31:03', '08/26/2021, 08:31:13', '08/26/2021, 08:31:23', '08/26/2021, 08:31:28', '08/26/2021, 08:31:38', '08/26/2021, 08:31:48', '08/26/2021, 08:31:58', '08/26/2021, 08:32:08', '08/26/2021, 08:32:18', '08/26/2021, 08:32:28', '08/26/2021, 08:32:38', '08/26/2021, 08:32:48', '08/26/2021, 08:32:58', '08/26/2021, 08:33:18', '08/26/2021, 08:33:18', '08/26/2021, 08:33:28', '08/26/2021, 08:33:28', '08/26/2021, 08:33:38', '08/26/2021, 08:33:48', '08/26/2021, 08:33:58', '08/26/2021, 08:34:08', '08/26/2021, 08:34:28', '08/26/2021, 08:34:38', '08/26/2021, 08:34:48', '08/26/2021, 08:34:58', '08/26/2021, 08:35:08', '08/26/2021, 08:35:18', '08/26/2021, 08:35:28', '08/26/2021, 08:35:38', '08/26/2021, 08:35:58', '08/26/2021, 08:36:08', '08/26/2021, 08:36:18', '08/26/2021, 08:36:28', '08/26/2021, 08:36:48', '08/26/2021, 08:37:08', '08/26/2021, 08:37:18', '08/26/2021, 08:37:28', '08/26/2021, 08:37:38', '08/26/2021, 08:37:48', '08/26/2021, 08:38:08', '08/26/2021, 08:38:18', '08/26/2021, 10:11:00', '08/26/2021, 10:11:05', '08/26/2021, 10:11:15', '08/26/2021, 10:12:05', '08/26/2021, 10:12:15', '08/26/2021, 10:12:25', '08/26/2021, 10:12:55', '08/26/2021, 10:13:05', '08/26/2021, 10:13:15', '08/26/2021, 10:13:25', '08/26/2021, 10:14:05', '08/26/2021, 10:14:15', '08/26/2021, 10:14:15', '08/26/2021, 10:14:25', '08/26/2021, 10:14:35', '08/26/2021, 10:14:35', '08/26/2021, 10:14:45', '08/26/2021, 10:14:45', '08/26/2021, 10:14:55', '08/26/2021, 10:15:05', '08/26/2021, 10:15:15', '08/26/2021, 10:15:35', '08/26/2021, 10:16:05', '08/26/2021, 10:16:15', '08/26/2021, 10:16:25', '08/26/2021, 10:16:35', '08/26/2021, 10:16:45']

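IIUC, you want to remove the large gaps between the peaks?

Using pandas

Updated answer

I broke the steps down into separate columns so you can see the logic: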

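import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'time': time, 'elev': elev})
df['time']   = pd.to_datetime(df['time'])
df['delta']  = df['time'].diff()                       # diff from previous time
# total_seconds() counts the whole gap; .dt.seconds drops the days component
df['gap']    = df['delta'].dt.total_seconds().gt(100)  # gap = diff > 100 seconds
df['group']  = df['gap'].cumsum()                      # make groups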


groups = df.groupby('group')
# squeeze=False keeps ax two-dimensional, so indexing also works with a single group
f, ax = plt.subplots(ncols=len(groups), sharey=True, squeeze=False)
for i, g in groups:
    ax[0, i].plot(g['time'], g['elev'])
    start = g['time'].iloc[0].time()
    stop = g['time'].iloc[-1].time()
    ax[0, i].set_title(f'group {i+1}\n({start}--{stop})')
plt.show()

Output:

Old answer

Raw data:
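
Since the question mentions that pandas is not available, the same gap-splitting idea can be sketched in plain Python. This is a minimal sketch, not the answer's original code: `split_at_gaps` and its `max_gap` parameter are hypothetical names, the 100-second threshold simply mirrors the one used above, and the five sample points are taken (rounded) from around the gap in the sample data.

```python
from datetime import datetime, timedelta

import matplotlib.pyplot as plt


def split_at_gaps(times, values, max_gap=timedelta(seconds=100)):
    """Split parallel, time-sorted lists into segments wherever the
    step between consecutive timestamps exceeds max_gap (hypothetical helper)."""
    segments = []
    for t, v in zip(times, values):
        # Start a new segment on the first point or right after a large gap
        if not segments or t - segments[-1][0][-1] > max_gap:
            segments.append(([], []))
        segments[-1][0].append(t)
        segments[-1][1].append(v)
    return segments


# A few points around the gap in the sample data (elevations rounded)
raw_times = ['08/26/2021, 08:38:08', '08/26/2021, 08:38:18',
             '08/26/2021, 10:11:00', '08/26/2021, 10:11:05',
             '08/26/2021, 10:11:15']
times = [datetime.strptime(t, '%m/%d/%Y, %H:%M:%S') for t in raw_times]
elev = [4.18, 3.52, 10.47, 10.83, 11.55]

segments = split_at_gaps(times, elev)

# One subplot per segment, sharing the elevation axis
fig, axes = plt.subplots(ncols=len(segments), sharey=True, squeeze=False)
for ax, (seg_times, seg_elev) in zip(axes[0], segments):
    ax.plot(seg_times, seg_elev, marker='.')
    ax.set_title(f'{seg_times[0].time()} to {seg_times[-1].time()}')
    ax.grid()
```

Call `plt.show()` (or `fig.savefig(...)`) afterwards to view the figure. Each subplot keeps a real time axis, so the timestamps stay readable while the long idle period between segments is not drawn.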

from datetime import datetime
import matplotlib.pyplot as plt
elev = [7.061637017210896, 8.62634035986128, 9.449231409579046, 9.449245213599722, 11.183401391828983, 11.183478912151985, 12.097695062804538, 14.032121063226736, 19.53103255309029, 20.132430448781705, 22.61562154333468, 23.892538058003574, 25.174568988146742, 25.81347252259264, 27.07766665010065, 28.301824218809962, 29.4560748154805, 30.51425894250495, 31.44003996067941, 32.19935454662037, 32.75797351858856, 33.09230892539046, 33.185638377860386, 32.64289077682021, 32.64282073446187, 32.03439364065985, 32.03432718356379, 31.235743890788736, 30.278072995085186, 29.198208966807904, 28.02762496428912, 25.534718034319297, 24.259335234095236, 22.987561974637945, 21.733969026630948, 20.50551578656278, 19.30698140187512, 18.145822000390414, 17.021157410685678, 14.89032761900031, 13.881534452146786, 12.910228441720443, 11.9735858799619, 10.20078824064575, 8.548230021876677, 7.7622314951825935, 7.002108526108933, 6.2652418436101245, 5.550265750342097, 4.180538242033181, 3.523953356314147, 10.468976986513358, 10.826799614265274, 11.548804997129018, 15.198784031309774, 15.913277577899912, 16.609161706884624, 18.52422058507705, 19.077032064883326, 19.57286148977654, 20.002244208317894, 20.91143576667658, 20.91127829131031, 20.911272488234292, 20.817089892791472, 20.630717861747748, 20.630698531303153, 20.35705893695184, 20.357030796826844, 20.001375885553323, 19.571700419702164, 19.075697249423904, 17.921110661414083, 15.911217262744842, 15.197579232135709, 14.472858366158526, 13.740784521720999, 13.007405649336956]
time = ['08/26/2021, 08:28:28', '08/26/2021, 08:28:48', '08/26/2021, 08:28:58', '08/26/2021, 08:28:58', '08/26/2021, 08:29:18', '08/26/2021, 08:29:18', '08/26/2021, 08:29:28', '08/26/2021, 08:29:48', '08/26/2021, 08:30:38', '08/26/2021, 08:30:43', '08/26/2021, 08:31:03', '08/26/2021, 08:31:13', '08/26/2021, 08:31:23', '08/26/2021, 08:31:28', '08/26/2021, 08:31:38', '08/26/2021, 08:31:48', '08/26/2021, 08:31:58', '08/26/2021, 08:32:08', '08/26/2021, 08:32:18', '08/26/2021, 08:32:28', '08/26/2021, 08:32:38', '08/26/2021, 08:32:48', '08/26/2021, 08:32:58', '08/26/2021, 08:33:18', '08/26/2021, 08:33:18', '08/26/2021, 08:33:28', '08/26/2021, 08:33:28', '08/26/2021, 08:33:38', '08/26/2021, 08:33:48', '08/26/2021, 08:33:58', '08/26/2021, 08:34:08', '08/26/2021, 08:34:28', '08/26/2021, 08:34:38', '08/26/2021, 08:34:48', '08/26/2021, 08:34:58', '08/26/2021, 08:35:08', '08/26/2021, 08:35:18', '08/26/2021, 08:35:28', '08/26/2021, 08:35:38', '08/26/2021, 08:35:58', '08/26/2021, 08:36:08', '08/26/2021, 08:36:18', '08/26/2021, 08:36:28', '08/26/2021, 08:36:48', '08/26/2021, 08:37:08', '08/26/2021, 08:37:18', '08/26/2021, 08:37:28', '08/26/2021, 08:37:38', '08/26/2021, 08:37:48', '08/26/2021, 08:38:08', '08/26/2021, 08:38:18', '08/26/2021, 10:11:00', '08/26/2021, 10:11:05', '08/26/2021, 10:11:15', '08/26/2021, 10:12:05', '08/26/2021, 10:12:15', '08/26/2021, 10:12:25', '08/26/2021, 10:12:55', '08/26/2021, 10:13:05', '08/26/2021, 10:13:15', '08/26/2021, 10:13:25', '08/26/2021, 10:14:05', '08/26/2021, 10:14:15', '08/26/2021, 10:14:15', '08/26/2021, 10:14:25', '08/26/2021, 10:14:35', '08/26/2021, 10:14:35', '08/26/2021, 10:14:45', '08/26/2021, 10:14:45', '08/26/2021, 10:14:55', '08/26/2021, 10:15:05', '08/26/2021, 10:15:15', '08/26/2021, 10:15:35', '08/26/2021, 10:16:05', '08/26/2021, 10:16:15', '08/26/2021, 10:16:25', '08/26/2021, 10:16:35', '08/26/2021, 10:16:45']
time = [datetime.strptime(t, '%m/%d/%Y, %H:%M:%S') for t in time]
ax = plt.subplot()
ax.plot(time, elev, marker='.')

The problem is that the points are not evenly spaced (intervals in seconds):

   interval  count
0      10.0     50
1      20.0     11
2       0.0      7
3       5.0      3
4      50.0      2
5      30.0      2
6      40.0      1
7    5562.0      1
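
The answer does not show how this table was produced. One way to obtain similar counts without pandas might look like the sketch below; `interval_counts` is a hypothetical helper name, and the five timestamps are taken from around the gap in the sample data.

```python
from collections import Counter
from datetime import datetime


def interval_counts(times):
    """Count how often each step (in seconds) occurs between
    consecutive timestamps (hypothetical helper)."""
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return Counter(gaps).most_common()


# A few timestamps around the gap in the sample data
raw_times = ['08/26/2021, 08:38:08', '08/26/2021, 08:38:18',
             '08/26/2021, 10:11:00', '08/26/2021, 10:11:05',
             '08/26/2021, 10:11:15']
times = [datetime.strptime(t, '%m/%d/%Y, %H:%M:%S') for t in raw_times]

for interval, count in interval_counts(times):
    print(interval, count)
```

A single very large interval standing apart from the rest, like the 5562-second one in the table, is what marks the gap.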

A simple solution is to plot with a fixed step, without keeping the time information:

ax = plt.subplot()
ax.plot(range(len(elev)), elev, marker='.')
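
Because the question asks to keep the time of each value, one possible variation of the fixed-step plot is to keep plotting against the index but label the ticks with the real timestamps. This is only a sketch under assumptions: `plot_evenly_spaced` and `n_ticks` are hypothetical names, and the sample points are taken (rounded) from around the gap in the sample data.

```python
from datetime import datetime

import matplotlib.pyplot as plt


def plot_evenly_spaced(times, values, n_ticks=6):
    """Plot values against their index and label a subset of ticks
    with the corresponding timestamps (hypothetical helper)."""
    fig, ax = plt.subplots()
    positions = list(range(len(values)))
    ax.plot(positions, values, marker='.')
    # Label roughly n_ticks evenly spread points with their real times
    step = max(1, len(values) // n_ticks)
    tick_positions = positions[::step]
    ax.set_xticks(tick_positions)
    ax.set_xticklabels(
        [times[i].strftime('%H:%M:%S') for i in tick_positions],
        rotation=45, ha='right')
    ax.grid()
    return fig, ax


# A few points around the gap in the sample data (elevations rounded)
raw_times = ['08/26/2021, 08:38:08', '08/26/2021, 08:38:18',
             '08/26/2021, 10:11:00', '08/26/2021, 10:11:05',
             '08/26/2021, 10:11:15']
times = [datetime.strptime(t, '%m/%d/%Y, %H:%M:%S') for t in raw_times]
elev = [4.18, 3.52, 10.47, 10.83, 11.55]

fig, ax = plot_evenly_spaced(times, elev)
```

Call `plt.show()` afterwards to view the result. The x axis is no longer linear in time, so the spacing between points carries no meaning; only the tick labels tell when each value occurred.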

There are other approaches, but they are more complex, so if the quick solution doesn't suit you, please clarify your requirements.