Change value in pandas series based on hour of the day using df.apply and if statement

Question

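I have a large df with a datetime index, hourly time steps, and precipitation values in several columns. My precipitation values are cumulative totals over the day (from 1:00 am to 0:00 am of the next day) and are reset every day, for example: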

datetime                      S1                                                                        
2000-01-01 00:00:00          4.5  ...  
2000-01-01 01:00:00            0  ...  
2000-01-01 02:00:00            0  ...  
2000-01-01 03:00:00            0  ...  
2000-01-01 04:00:00            0
2000-01-01 05:00:00            0
2000-01-01 06:00:00            0
2000-01-01 07:00:00            0
2000-01-01 08:00:00            0
2000-01-01 09:00:00            0
2000-01-01 10:00:00            0
2000-01-01 11:00:00          6.5
2000-01-01 12:00:00          7.5
2000-01-01 13:00:00          8.7
2000-01-01 14:00:00          8.7
...
2000-01-01 22:00:00          8.7
2000-01-01 23:00:00          8.7
2000-01-02 00:00:00          8.7
2000-01-02 01:00:00            0

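I am trying to get from this to the actual hourly values. The value at 1:00 each day is fine as it is; for every other hour I want to subtract the value of the previous time step. Can I somehow use an if statement inside df.apply? I thought of something like this: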

df_copy = df.copy()
df = df.apply(lambda x: if df.hour !=1: era5_T[x]=era5_T[x]-era5_T_copy[x-1])

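But this does not work, since I am not calling a function? I could use a for loop, but that does not seem like the most efficient approach, as I am working with a large dataset.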

Answer 1

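You can use numpy.where and pd.Series.shift to achieve the result. Since the timestamps are in the index, the hour is read from df.index.hour: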

import numpy as np
# Keep the 01:00 value as is; otherwise subtract the previous time step.
df['hourly_S1'] = np.where(df.index.hour == 1, df.S1, df.S1 - df.S1.shift())
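
To make this easier to check, here is a minimal self-contained sketch of the same idea. The sample values, the second column S2, and the names is_reset, hourly_S1 and hourly are assumptions made up for illustration; they are not from the question. It also shows one possible way to handle several columns at once, using DataFrame.diff in place of the explicit shift.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (values invented for illustration): cumulative
# precipitation that restarts at 01:00 each day.
idx = pd.date_range("2000-01-01 22:00", periods=6, freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    {"S1": [8.0, 8.5, 8.5, 0.5, 0.5, 2.0],
     "S2": [1.0, 1.0, 3.0, 0.0, 1.0, 1.0]},
    index=idx,
)

# True for the 01:00 rows, where the cumulative value already is the hourly value.
is_reset = df.index.hour == 1

# Single column, as in the answer: keep the 01:00 value, otherwise take the
# difference to the previous time step.
hourly_S1 = np.where(is_reset, df["S1"], df["S1"] - df["S1"].shift())

# All columns at once: difference every column, then restore the 01:00 rows.
hourly = df.diff()
hourly.loc[is_reset] = df.loc[is_reset]

print(hourly)
```

The first row comes out as NaN in both variants because it has no previous time step to subtract; whether to fill it, drop it, or keep the raw value depends on where the record starts.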
