YOY growth based on ID
I am trying to compute the year-over-year (YOY) growth rate of a variable in a pandas DataFrame. My data looks like this:
Year | Country | Industry | Value |
---|---|---|---|
2000 | USA | Manufacturing | 5 |
2000 | Mexico | Manufacturing | 10 |
2001 | Mexico | Manufacturing | 15 |
2002 | Mexico | Other | 20 |
The number of observations differs by country and industry. Expected output:
Year | Country | Industry | Value | YOY |
---|---|---|---|---|
2000 | USA | Manufacturing | 5 | NaN |
2000 | Mexico | Manufacturing | 10 | NaN |
2001 | Mexico | Manufacturing | 15 | 50% |
2002 | Mexico | Other | 20 | NaN |
I have tried different approaches, including:
df.groupby(['Country','Industry','Year'])['Value'].pct_change()
df['YOY'] = (df['Value'] - df.sort_values(by=['Country','Industry','Year']).groupby(['Country','Industry'])['Value'].shift(1)) / df['Value']
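The first line computes growth between consecutive rows without resetting for each new country or industry. The second gives incoherent results.
Any clues that could help? Thanks!!
Try this: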
df['YOY'] = df.groupby(['Country','Industry'])['Value'].pct_change().mul(100)
Output:
>>> df
   Year Country       Industry  Value   YOY
0  2000     USA  Manufacturing      5   NaN
1  2000  Mexico  Manufacturing     10   NaN
2  2001  Mexico  Manufacturing     15  50.0
3  2002  Mexico          Other     20   NaN
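For completeness, here is a self-contained sketch built on the sample data from the question. Since `pct_change` compares each row with the previous row of its group, the sketch sorts by `Year` first. It also shows the manual `shift` form, which has to divide by the previous value rather than the current one. The `keys` variable, the `YOY_manual` column, and the rule that a gap in years should give `NaN` are my own assumptions for illustration, not something stated in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Year": [2000, 2000, 2001, 2002],
    "Country": ["USA", "Mexico", "Mexico", "Mexico"],
    "Industry": ["Manufacturing", "Manufacturing", "Manufacturing", "Other"],
    "Value": [5, 10, 15, 20],
})

keys = ["Country", "Industry"]  # hypothetical name for the grouping columns

# pct_change follows row order within each group, so sort by Year first.
df = df.sort_values(keys + ["Year"])

# Growth relative to the previous row of the same (Country, Industry) group, in percent.
df["YOY"] = df.groupby(keys)["Value"].pct_change().mul(100)

# Equivalent manual form: divide the difference by the PREVIOUS value.
prev = df.groupby(keys)["Value"].shift(1)
df["YOY_manual"] = (df["Value"] - prev).div(prev).mul(100)

# Assumption: if a year is missing inside a group, the change is not
# year-over-year, so keep the value only when the previous row is exactly
# one year earlier.
prev_year = df.groupby(keys)["Year"].shift(1)
df["YOY"] = df["YOY"].where(df["Year"] - prev_year == 1)

# Restore the original row order for display.
df = df.sort_index()
print(df)
```

If your groups never skip a year, the `where` step changes nothing and can be dropped.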