How to add multiple columns as a result of groupby in pandas (Python)

Question

Suppose I have a data frame:

date | brand | color
--------------------
2017 | BMW   | red
2017 | GM    | blue
2017 | BMW   | blue
2017 | BMW   | red
2018 | BMW   | green
2018 | GM    | blue
2018 | GM    | blue
2018 | GM    | red
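
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame above. The dtypes are an assumption (integer years, string brand and color), since the question only shows the printed table.

```python
import pandas as pd

# Rebuild the example frame from the table in the question.
# Assumption: `date` holds integer years; the question does not state the dtype.
df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

print(df)
```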

I would like to get something like this:

date | brand | red | blue | green
---------------------------------
2017 | BMW   |  2  |  1   |   0
     |  GM   |  0  |  1   |   0
2018 | BMW   |  0  |  0   |   1
     |  GM   |  1  |  2   |   0

I figured out that I need to use groupby + size, for example:

df[df['color'] == 'red'].groupby([df['date'], df['brand']]).size()

But this only gives me a Series for a single color, whereas I want the complete data frame shown above.

Answer 1

df.groupby(['date','brand'])[['red','blue','green']].count()

Or...

df.groupby(['date','brand']).agg('count')
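
Note that the frame in the question has a single color column rather than separate red, blue, and green columns, so selecting those names on the groupby would fail as written, and a plain count would give one number per group rather than one per color. One possible way to make this idea work is to create indicator columns first with `pd.get_dummies` and then sum them per group. This is only a sketch under that assumption; the names `dummies` and `counts` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# One 0/1 indicator column per distinct color (blue, green, red).
dummies = pd.get_dummies(df["color"], dtype=int)

# Attach the indicators to the grouping keys, then sum them within each
# (date, brand) group; the sum of a 0/1 column is the per-color count.
counts = (
    pd.concat([df[["date", "brand"]], dummies], axis=1)
      .groupby(["date", "brand"])[["red", "blue", "green"]]
      .sum()
)

print(counts)
```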

Answer 2

It is as simple as it looks.

Option 1: crosstab

pd.crosstab([df['date'],df['brand']], df['color'])
Out[30]: 
 color          blue   green   red
date   brand                      
2017   BMW         1       0     2
       GM          1       0     0
2018   BMW         0       1     0
       GM          2       0     1
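
crosstab sorts the color columns alphabetically. If the red, blue, green order from the question matters, or a flat frame without the MultiIndex is preferred, something along these lines might do. The names `wide` and `flat` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# Count occurrences of each color per (date, brand) pair.
wide = pd.crosstab([df["date"], df["brand"]], df["color"])

# Put the columns in the order requested in the question; a color that
# never occurs would get a column of zeros instead of raising.
wide = wide.reindex(columns=["red", "blue", "green"], fill_value=0)

# Optional: turn the (date, brand) index back into ordinary columns
# and drop the leftover "color" label on the column axis.
flat = wide.reset_index()
flat.columns.name = None

print(flat)
```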

Option 2: groupby and unstack

df.groupby(['date','brand','color'])['color'].count().unstack(-1).fillna(0)
Out[40]: 
 color          blue   green   red
date   brand                      
2017   BMW       1.0     0.0   2.0
       GM        1.0     0.0   0.0
2018   BMW       0.0     1.0   0.0
       GM        2.0     0.0   1.0
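
The counts above come out as floats because unstack introduces NaN for the missing combinations before fillna runs. A small variation, sketched here, passes `fill_value=0` to `unstack` so the result stays integer; it also uses `size()`, which is what the question started from.

```python
import pandas as pd

df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# size() counts rows per (date, brand, color) group; unstack() moves the
# innermost level (color) into columns, filling absent combinations with 0
# directly so no NaN (and hence no float conversion) is involved.
counts = df.groupby(["date", "brand", "color"]).size().unstack(fill_value=0)

print(counts)
```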

Option 3: pivot_table

pd.pivot_table(df.reset_index(),index=['date','brand'],columns='color',values='index',aggfunc='count').fillna(0)
Out[57]: 
color          blue   green   red
date brand                       
2017  BMW       1.0     0.0   2.0
      GM        1.0     0.0   0.0
2018  BMW       0.0     1.0   0.0
      GM        2.0     0.0   1.0
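
The reset_index call above is only there to get a column to count. As a possible alternative, pivot_table can be asked for the group size directly with `aggfunc='size'`, and `fill_value=0` fills the gaps. Treat this as a sketch rather than the one right way to do it.

```python
import pandas as pd

df = pd.DataFrame({
    "date":  [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018],
    "brand": ["BMW", "GM", "BMW", "BMW", "BMW", "GM", "GM", "GM"],
    "color": ["red", "blue", "blue", "red", "green", "blue", "blue", "red"],
})

# aggfunc="size" counts rows per (date, brand, color) combination, so no
# helper column is needed; fill_value=0 replaces the missing combinations.
table = pd.pivot_table(
    df,
    index=["date", "brand"],
    columns="color",
    aggfunc="size",
    fill_value=0,
)

print(table)
```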


python

pandas

pandas-groupby