Pandas DataFrame GroupBy sum/count to new DataFrame
My DataFrame is:
State|City|Year|Budget|Income
S1|C1|2000|1000|1
S1|C2|2000|1200|2
S2|C3|2000|5500|3
I need a new DataFrame with the columns State, Year, Count, Sum_Budget and Sum_Income, i.e.:
State|Year|Count|Sum_Budget|Sum_Income
S1|2000|2|2200|3
S2|2000|1|5500|3
In C#, the code would be:
dataframe
    .GroupBy(x => new { x.State, x.Year })
    .Select(x => new {
        x.Key.State,
        x.Key.Year,
        Count = x.Count(),
        Sum_Budget = x.Sum(y => y.Budget),
        Sum_Income = x.Sum(y => y.Income)
    }).ToArray();
How can I do this with Pandas?
Use agg:
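# Map the aggregated source columns to their new names
d = {'Income':'Sum_Income','Budget':'Sum_Budget','City':'Count'}
# Sum Budget and Income; count the rows per group via the size of City
agg_d = {'Budget':'sum', 'Income':'sum', 'City':'size'}
df = df.groupby(['State', 'Year'], as_index=False).agg(agg_d).rename(columns=d)
print(df)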
  State  Year  Sum_Income  Sum_Budget  Count
0    S1  2000           3        2200      2
1    S2  2000           3        5500      1
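As an alternative sketch, assuming pandas 0.25 or later, named aggregation lets you set the output column names and their order directly in agg, so the separate rename step is not needed. The DataFrame below is rebuilt from the sample data in the question, and the variable name result is only for illustration.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "State": ["S1", "S1", "S2"],
    "City": ["C1", "C2", "C3"],
    "Year": [2000, 2000, 2000],
    "Budget": [1000, 1200, 5500],
    "Income": [1, 2, 3],
})

# Named aggregation: new_column=(source_column, aggregation)
result = df.groupby(["State", "Year"], as_index=False).agg(
    Count=("City", "size"),        # number of rows in each group
    Sum_Budget=("Budget", "sum"),  # total budget per group
    Sum_Income=("Income", "sum"),  # total income per group
)
print(result)
```

The columns come out in the order the keyword arguments are written, which matches the layout asked for in the question (State, Year, Count, Sum_Budget, Sum_Income).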