YOY growth based on ID

I'm trying to compute the year-over-year growth rate of a variable in a pandas DataFrame. My data looks like this:

Year Country Industry Value
2000 USA Manufacturing 5
2000 Mexico Manufacturing 10
2001 Mexico Manufacturing 15
2002 Mexico Other 20

The number of observations varies by country and industry. Expected output:

Year Country Industry Value YOY
2000 USA Manufacturing 5 NaN
2000 Mexico Manufacturing 10 NaN
2001 Mexico Manufacturing 15 50%
2002 Mexico Other 20 NaN

I've tried several approaches, including:

df.groupby(['Country','Industry','Year'])['Value'].pct_change()

df['YOY'] = (df['Value'] - df.sort_values(by=['Country','Industry','Year']).groupby(['Country','Industry'])['Value'].shift(1)) / df['Value']

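The first line computes growth between rows without resetting for a new country or industry. The second gives incoherent results.

Any pointers would help. Thanks!!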

Try this:

df['YOY'] = df.groupby(['Country','Industry'])['Value'].pct_change().mul(100)

Output:

>>> df
   Year Country       Industry  Value   YOY
0  2000     USA  Manufacturing      5   NaN
1  2000  Mexico  Manufacturing     10   NaN
2  2001  Mexico  Manufacturing     15  50.0
3  2002  Mexico          Other     20   NaN
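
For reference, here is a self-contained sketch of the same approach on the sample data from the question. It assumes the rows may not already be in year order, so it sorts before grouping; the `manual` series is only an illustrative cross-check and is not part of the original answer. It computes the same percentage with `shift`, dividing by the previous year's value rather than the current one.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Year": [2000, 2000, 2001, 2002],
    "Country": ["USA", "Mexico", "Mexico", "Mexico"],
    "Industry": ["Manufacturing", "Manufacturing", "Manufacturing", "Other"],
    "Value": [5, 10, 15, 20],
})

# Sort so that, within each (Country, Industry) group, rows are in year order
df = df.sort_values(["Country", "Industry", "Year"])

# Percentage change relative to the previous row of the same group
grouped = df.groupby(["Country", "Industry"])["Value"]
df["YOY"] = grouped.pct_change().mul(100)

# Cross-check (illustrative): the same quantity written out with shift
prev = grouped.shift(1)
manual = (df["Value"] - prev) / prev * 100

# Restore the original row order
df = df.sort_index()
manual = manual.sort_index()

print(df)
```

Note that `pct_change` compares adjacent rows within a group, so if a year is missing for some country and industry, the result for the next available row spans more than one year.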