Column sum in pandas groupby
Below is the data frame:
Skill Category Location Market Type Count
Java Cat1 Europe Tier1 A 2
Java Cat1 Europe Tier1 B 1
Java Cat1 Europe Tier1 C 1
Java Cat2 Asia Tier2 D 1
Java Cat3 Asia Tier1 E 1
Below is the expected output data frame:
Skill Category Location Market Type Count Sum_Market
Java Cat1 Europe Tier1 A 2 4
Java Cat1 Europe Tier1 B 1 4
Java Cat1 Europe Tier1 C 1 4
Java Cat2 Asia Tier2 D 1 1
Java Cat3 Asia Tier1 E 1 1
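Problem statement: Sum_Market should hold, for every row, the total Count of the group defined by that row's Skill, Category, Location and Market tier.
Here is my attempt: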
df.groupby(['Skill','Category','Location','Market','Type'])['count'].sum()
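Just merge the group totals back into the original. Note that the column is named Count, not count, and that Type has to be left out of the grouping keys; otherwise every group contains a single row and the sum just repeats Count:
df.merge(
    df.groupby(['Skill','Category','Location','Market'])['Count'].sum().rename('Sum_Market').reset_index()
)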
Or use transform, which returns a result aligned to the original rows, so no merge is needed:
df['Sum_Market'] = df.groupby(['Skill','Category','Location'])['Count'].transform('sum')
OUTPUT
Skill Category Location Market Type Count Sum_Market
0 Java Cat1 Europe Tier1 A 2 4
1 Java Cat1 Europe Tier1 B 1 4
2 Java Cat1 Europe Tier1 C 1 4
3 Java Cat2 Asia Tier2 D 1 1
4 Java Cat3 Asia Tier1 E 1 1
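For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample frame from the question and runs both approaches side by side. The variable names (keys, totals, via_transform, via_merge) are my own, and I have assumed Market belongs in the grouping keys, as the problem statement suggests; for this sample data, grouping on Skill, Category and Location alone gives the same totals.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Skill": ["Java"] * 5,
    "Category": ["Cat1", "Cat1", "Cat1", "Cat2", "Cat3"],
    "Location": ["Europe", "Europe", "Europe", "Asia", "Asia"],
    "Market": ["Tier1", "Tier1", "Tier1", "Tier2", "Tier1"],
    "Type": ["A", "B", "C", "D", "E"],
    "Count": [2, 1, 1, 1, 1],
})

# Assumed grouping keys: everything except Type and Count.
keys = ["Skill", "Category", "Location", "Market"]

# Approach 1: transform broadcasts each group's sum back onto its rows.
via_transform = df.assign(
    Sum_Market=df.groupby(keys)["Count"].transform("sum")
)

# Approach 2: aggregate to one row per group, then merge back on the keys.
totals = (
    df.groupby(keys, as_index=False)["Count"]
    .sum()
    .rename(columns={"Count": "Sum_Market"})
)
via_merge = df.merge(totals, on=keys, how="left")

print(via_transform)
```

Both frames should end up with the same Sum_Market column. transform is usually the shorter option when you only need one new column; the merge route is handy when you already need the aggregated table for something else.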