Pandas: How to extract data that has been grouped by

Question

Here is sample code that demonstrates my problem:

import numpy as np
import pandas as pd

np.random.seed(10)

df = pd.DataFrame(np.random.randint(0,10,size=(100, 2)), columns=list('xy'))

df

    x   y
0   9   4
1   0   1
2   9   0
3   1   8
4   9   0
... ... ...
95  0   4
96  6   4
97  9   8
98  0   7
99  1   7

groups = df.groupby(['x'])

groups.size()

x
0    11
1    12
2    15
3    13
4    14
5     5
6     6
7     9
8     5
9    10
dtype: int64

How can I access the x values as one column and the aggregated y values as a second column, so that I can plot x against y?

Answer 1

There are two options.

Use reset_index() on the result:

groups = df.groupby(['x']).size().reset_index(name='size')

Pass as_index=False to groupby:

groups = df.groupby(['x'], as_index=False).size()

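Output of both (the counts match groups.size() in the question):

   x  size
0  0    11
1  1    12
2  2    15
3  3    13
4  4    14
5  5     5
6  6     6
7  7     9
8  8     5
9  9    10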
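As a quick sanity check, here is a minimal sketch on a small hand-made DataFrame showing that both options give the same two-column result. The data and the names by_reset and by_flag are made up for illustration. It assumes pandas 1.1 or later, where size() with as_index=False returns a DataFrame rather than a Series.

```python
import pandas as pd

# Small illustrative frame (not the random data from the question)
df = pd.DataFrame({'x': [0, 0, 1, 2, 2, 2],
                   'y': [5, 3, 8, 1, 4, 9]})

# Option 1: group, count rows per group, then move x out of the index
by_reset = df.groupby(['x']).size().reset_index(name='size')

# Option 2: keep x as a regular column from the start
by_flag = df.groupby(['x'], as_index=False).size()

print(by_reset)
print(by_flag)
```

Either way, x and size are ordinary columns afterwards, so they can be passed by name to plotting or further processing.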

Answer 2

IIUC, use as_index=False:

groups = df.groupby(['x'], as_index=False)
out = groups.size()
out.plot(x='x', y='size')

If you only want the plot, you can also keep x as the index:

df.groupby(['x']).size().plot()

Output: a line plot of group size against x.
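The question also mentions aggregated y values, and the same as_index=False pattern should work for aggregations other than size(). The following is only a sketch under that assumption: the mean is picked as an example aggregate, and the data and the name means are hypothetical.

```python
import pandas as pd

# Small illustrative frame (not the random data from the question)
df = pd.DataFrame({'x': [0, 0, 1, 2, 2, 2],
                   'y': [5, 3, 8, 1, 4, 10]})

# Mean of y per x value; as_index=False keeps x as a regular column
means = df.groupby('x', as_index=False)['y'].mean()

print(means)
```

The result has the columns x and y, so means.plot(x='x', y='y') would draw x against the aggregated y (this needs matplotlib installed).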
