Pandas: Multiple Grand Totals in a Summary Dataframe
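Apologies for the newbie question; I am trying to learn Python. I look forward to getting up to speed and giving back to the community.
Suppose I have the following data: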
YEAR      SECTOR      PROFIT  STARTMVYEAR  TOTALPROFIT  STARTMV
IBM       TECHNOLOGY    -500         2500          500     1500
APPLE     TECHNOLOGY     800         4000          300     4500
GM        INDUSTRIAL     250         1000            0     1250
CHRYSLER  INDUSTRIAL     600         3000          100     3500
I would like to create a summary like the following:
SECTOR      PROFITYEAR  TOTALPROFIT
TECHNOLOGY        .046         .133
INDUSTRIAL        .213         .021
For each group, the two values are sum(PROFIT)/sum(STARTMVYEAR) and sum(TOTALPROFIT)/sum(STARTMV).
If I only wanted the first ratio, I could do:
by_profit_totals = df.groupby(['SECTOR'])['PROFIT'].sum() / df.groupby(['SECTOR'])['STARTMVYEAR'].sum()
But how do I do it for both? Also, is there a simple function I can use, for example one that takes profit and startmvyear and returns the aggregated value?
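You can use groupby with the Cython-optimized sum aggregation, and then div by the NumPy array created by values. Dividing by the raw array makes the division positional; dividing by the DataFrame itself would align on column names, which differ between numerators and denominators, and produce NaN: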
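# Sum the numeric columns within each sector.
g = df.groupby('SECTOR').sum(numeric_only=True)
# .values drops the column labels, so each numerator is divided by the denominator in the same position.
print(g[['PROFIT', 'TOTALPROFIT']].div(g[['STARTMVYEAR', 'STARTMV']].values).reset_index())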
       SECTOR    PROFIT  TOTALPROFIT
0  INDUSTRIAL  0.212500     0.021053
1  TECHNOLOGY  0.046154     0.133333
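On the second part of the question (a simple reusable function), one possible sketch is below. It rebuilds the sample data from the question so that it runs on its own. The helper name ratio_of_sums and its pairs argument are made up for illustration and are not part of pandas; only the groupby, sum and reset_index calls are library API.

```python
import pandas as pd

# Sample data copied from the question (the first column is labelled YEAR there).
df = pd.DataFrame({
    "YEAR": ["IBM", "APPLE", "GM", "CHRYSLER"],
    "SECTOR": ["TECHNOLOGY", "TECHNOLOGY", "INDUSTRIAL", "INDUSTRIAL"],
    "PROFIT": [-500, 800, 250, 600],
    "STARTMVYEAR": [2500, 4000, 1000, 3000],
    "TOTALPROFIT": [500, 300, 0, 100],
    "STARTMV": [1500, 4500, 1250, 3500],
})


def ratio_of_sums(frame, by, pairs):
    """Hypothetical helper: per-group sum(numerator) / sum(denominator).

    `pairs` maps each numerator column name to its denominator column name.
    """
    # Collect the columns we need, dropping duplicates but keeping order.
    cols = list(dict.fromkeys(list(pairs) + list(pairs.values())))
    sums = frame.groupby(by)[cols].sum()

    # Divide one pair at a time, so each numerator meets its own denominator.
    out = pd.DataFrame(index=sums.index)
    for num, den in pairs.items():
        out[num] = sums[num] / sums[den]
    return out.reset_index()


summary = ratio_of_sums(
    df, "SECTOR", {"PROFIT": "STARTMVYEAR", "TOTALPROFIT": "STARTMV"}
)
print(summary)
```

Because each pair is divided Series by Series on the shared group index, this version does not rely on the numerator and denominator columns being listed in matching order. Groups whose denominator sums to zero would come out as inf or NaN, which this sketch does not handle.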