For loops in a pandas DataFrame: how to filter by day within an hourly range and associate a value?
I have a DataFrame named df_sub that looks like this:
date open high low close volume
405 2022-01-03 08:00:00 4293.5 4295.5 4291.5
406 2022-01-03 08:01:00 4294.0 4295.5 4294.0
407 2022-01-03 08:02:00 4295.5 4297.5 4295.5
408 2022-01-03 08:03:00 4297.0 4298.0 4296.0
409 2022-01-03 08:04:00 4296.5 4296.5 4295.0
... ... ... ... ... ... ... ... ...
5460 2022-01-07 08:55:00 4311.0 4312.0 4310.5
5461 2022-01-07 08:56:00 4311.5 4311.5 4311.0
5462 2022-01-07 08:57:00 4311.0 4312.0 4310.0
I need to create a loop of this kind:
for row in df_sub:
    take a single day (so, in this case, 2022-01-03, 04...07) and create a column holding that day's df_sub["high"].max() value,
    so that every row of the same day carries the maximum high for that day.
    Naturally, this implies that on another day the maximum value will differ from the
    previous one, because the highs will be different.
You can use resample:
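import pandas as pd
df_sub['date'] = pd.to_datetime(df_sub['date'])  # resample needs a DatetimeIndex
df_sub = df_sub.set_index('date')
df_new = df_sub.resample('D')['high'].max()  # daily maximum of 'high', one row per day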
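Note that resample returns one row per day rather than a new column on the original rows. If the goal is a column that repeats each day's maximum on every row of that day, as the question describes, a groupby with transform is one possible approach. The following is a minimal sketch, not part of the original answer: the sample values are made up for illustration, and the column name day_high is a hypothetical choice.

```python
import pandas as pd

# Small made-up sample shaped like df_sub (values are illustrative only)
df_sub = pd.DataFrame({
    "date": pd.to_datetime([
        "2022-01-03 08:00:00", "2022-01-03 08:01:00",
        "2022-01-04 08:00:00", "2022-01-04 08:01:00",
    ]),
    "high": [4295.5, 4297.5, 4312.0, 4311.5],
})

# Group rows by calendar day (timestamps truncated to midnight) and
# broadcast each day's maximum 'high' back onto every row of that day.
# 'day_high' is a hypothetical column name chosen for this sketch.
day = df_sub["date"].dt.normalize()
df_sub["day_high"] = df_sub.groupby(day)["high"].transform("max")

print(df_sub)
```

No explicit for loop is needed: transform returns a result aligned with the original index, so it can be assigned directly as a column. Unlike resample('D'), this does not produce empty rows for days that have no data.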