Pandas Groupby Season and Year Average Column

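I have a DataFrame "ncData", shown below, and I am trying to group the data by season (Winter, Spring, Summer, Autumn) and take the mean of the wind_speed and power columns for each windfarm_name, for each season of each year. Here are the first few rows of ncData: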

ncData.head(2)
Out[432]: 
     site_name windfarm_name region_name                      time  \
4055     REDCK    Red Creek   Northeast 2019-12-28 20:00:00+00:00   
4056     REDCK    Red Creek   Northeast 2019-12-28 19:00:00+00:00   

      wind_speed    power       Dates     Hours  year month day  Season  
4055     5.89692  23.9702  2019-12-28  20:00:00  2019    12  28  Winter  
4056     4.75525  13.8225  2019-03-28  19:00:00  2019     3  28  Spring 

I tried something like this:

ncData.groupby([pd.Grouper(key='Season', freq='1Y'),pd.Grouper(key='windfarm_name')]).mean()

which raises this error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

I also tried this:

ncData.groupby(['Season','windfarm_name'],freq='1Y')['wind_speed'].mean()

I need output like this:

         time       windfarm_name  season         wind_speed power
0    1991          Red Creek      winter         3.917762   8.276560
1    1991          Red Creek      spring         3.046854   0.132271
2    1991          Red Creek      summer         3.737426   6.799836
3    1991          Red Creek      autumn         3.870350   4.010200
4    1991         Oasis Wind      winter         2.955412   2.898962
5    1991         Oasis Wind      spring         2.707168   0.076643

Thanks!

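You almost had it. Season is a plain string column, so it does not need pd.Grouper or freq; group by the year, windfarm_name and Season columns directly: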

ncData.groupby(['year', 'windfarm_name', 'Season'])[['wind_speed', 'power']].mean()
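
To get a flat table like the one requested in the question, rather than a result indexed by the group keys, one option is to pass as_index=False. The following is a minimal sketch on a small made-up frame; the sample values are hypothetical and only the column names come from the question:

```python
import pandas as pd

# Hypothetical sample shaped like ncData (values are made up for illustration)
ncData = pd.DataFrame({
    "windfarm_name": ["Red Creek", "Red Creek", "Red Creek", "Oasis Wind"],
    "year": [2019, 2019, 2019, 2019],
    "Season": ["Winter", "Winter", "Spring", "Winter"],
    "wind_speed": [6.0, 4.0, 5.0, 3.0],
    "power": [24.0, 14.0, 10.0, 2.0],
})

# as_index=False keeps the group keys as ordinary columns;
# the double brackets select a list of columns to aggregate
seasonal = (
    ncData
    .groupby(["year", "windfarm_name", "Season"], as_index=False)[["wind_speed", "power"]]
    .mean()
)
print(seasonal)
```

Each row of seasonal then holds one (year, windfarm_name, Season) combination with its mean wind_speed and power.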

Note that you don't have to split the time column into year, month and day. Just make sure it has a datetime dtype:
ncData.groupby([ncData['time'].dt.year, 'windfarm_name', 'Season'])[['wind_speed', 'power']].mean()
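
Going one step further, the Season key could also be derived from the time column instead of being stored. This is only a sketch under assumptions: the month_to_season mapping below (December to February as Winter, and so on) is a guess at how Season was defined, and the sample values are made up:

```python
import pandas as pd

# Hypothetical sample with only a datetime column (values are made up)
ncData = pd.DataFrame({
    "windfarm_name": ["Red Creek", "Red Creek", "Red Creek"],
    "time": pd.to_datetime([
        "2019-12-28 20:00:00+00:00",
        "2019-12-28 19:00:00+00:00",
        "2019-03-28 19:00:00+00:00",
    ]),
    "wind_speed": [6.0, 4.0, 5.0],
    "power": [24.0, 14.0, 10.0],
})

# Assumed month-to-season mapping; adjust to your own definition of the seasons
month_to_season = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

# Build the grouping keys from the datetime column via the .dt accessor
year = ncData["time"].dt.year.rename("year")
season = ncData["time"].dt.month.map(month_to_season).rename("Season")

result = (
    ncData
    .groupby([year, "windfarm_name", season])[["wind_speed", "power"]]
    .mean()
    .reset_index()
)
print(result)
```

Renaming the key Series gives the resulting columns readable names (year, Season) after reset_index.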