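Finding the 15-minute interval with the highest value over 10 years in pandas
I have a DataFrame containing 10 years of observations recorded at 15-minute intervals.
I want to find the 15-minute interval with the highest value across all of those years.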
time_start count_id location_id obs
time
2000-03-07 07:30:00 2000-03-07 07:30:00-05:00 8182 3939 2.0
2000-03-07 07:45:00 2000-03-07 07:45:00-05:00 8182 3939 0.0
2000-03-07 08:00:00 2000-03-07 08:00:00-05:00 8182 3939 13.0
2000-03-07 08:15:00 2000-03-07 08:15:00-05:00 8182 3939 12.0
2000-03-07 08:30:00 2000-03-07 08:30:00-05:00 8182 3939 6.0
... ... ... ... ...
2000-03-01 17:45:00 2000-03-01 17:45:00-05:00 8193 5600 40.0
2000-01-11 07:30:00 2000-01-11 07:30:00-05:00 8194 5601 59.0
2000-01-11 07:45:00 2000-01-11 07:45:00-05:00 8194 5601 50.0
2000-01-11 08:00:00 2000-01-11 08:00:00-05:00 8194 5601 37.0
2000-01-11 08:15:00 2000-01-11 08:15:00-05:00 8194 5601 31.0
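I used the following code to create a heatmap of the mean of the 10 years of observations (obs) for each 15-minute slot of the 24-hour day, with the highest peds_sum shown in the darkest color.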
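import pandas as pd
import hvplot.pandas  # registers the .hvplot accessor on DataFrames
# Mean obs grouped by time of day and by 15-minute bin of the DatetimeIndex
counts_df = stationData10['obs'].groupby([stationData10.index.time, pd.Grouper(freq='15Min')]).mean().to_frame(name='n')
counts_df.rename_axis(['15Min', 'day'], inplace=True)
counts_df.hvplot.heatmap(title='Record count', x='15Min', y='day', C='n', width=FIGSIZE[0], height=FIGSIZE[1])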
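The time-of-day grouping described above can also be used without plotting, to read off which 15-minute slot has the highest average. This is only a minimal sketch, assuming a DataFrame with a DatetimeIndex and an obs column; the sample values and the names df, mean_by_slot and best_slot are invented for illustration and do not come from the original data.

```python
import pandas as pd

# Invented sample data on a 15-minute grid over two days (not the real dataset).
idx = pd.to_datetime([
    "2000-03-07 07:30", "2000-03-07 07:45", "2000-03-07 08:00",
    "2000-03-08 07:30", "2000-03-08 07:45", "2000-03-08 08:00",
])
df = pd.DataFrame({"obs": [2.0, 0.0, 13.0, 4.0, 6.0, 11.0]}, index=idx)

# Average obs for each time-of-day slot across all days.
mean_by_slot = df["obs"].groupby(df.index.time).mean()

# Time-of-day slot with the highest average obs.
best_slot = mean_by_slot.idxmax()
print(best_slot, mean_by_slot.max())
```

Grouping by index.time collapses all days onto a single 24-hour axis, so the result is the busiest time of day on average, not the single busiest record.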
I solved my problem.
First, select the discrete time windows that make up the 8 hours of observations:
df1 = df.between_time('7:30', '11:30')
df2 = df.between_time('13:00', '14:45')
df3 = df.between_time('16:00', '17:45')
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df_final = pd.concat([df1, df2, df3])
Then,
total_df_final = df_final.groupby(['count_id', 'location_id'])['obs'].sum()
# idxmax() returns the (count_id, location_id) pair with the largest total
count_id, location_id = total_df_final.idxmax()
print("At location {} on day of {}, the maximum observation of {} was recorded."
      .format(location_id, count_id, total_df_final.max()))
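Note that the grouped sum above reports the (count_id, location_id) pair with the largest total, not one individual 15-minute record. If the single 15-minute row with the largest obs is what is wanted, one possible approach is sketched below. It assumes the same column names as in the question; the sample rows and the names pos, peak_time and peak_row are invented for illustration. A positional lookup is used because several locations may share the same timestamp in the index.

```python
import numpy as np
import pandas as pd

# Invented sample rows mirroring the columns shown in the question.
idx = pd.to_datetime([
    "2000-03-07 08:00", "2000-03-07 08:15",
    "2000-01-11 07:30", "2000-01-11 07:45",
])
df = pd.DataFrame(
    {
        "count_id": [8182, 8182, 8194, 8194],
        "location_id": [3939, 3939, 5601, 5601],
        "obs": [13.0, 12.0, 59.0, 50.0],
    },
    index=idx,
)

# Position of the largest obs, ignoring NaN (first position on ties).
pos = np.nanargmax(df["obs"].to_numpy())

# Positional lookup stays unambiguous even if timestamps repeat in the index.
peak_time = df.index[pos]
peak_row = df.iloc[pos]
print(peak_time, int(peak_row["location_id"]), peak_row["obs"])
```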