Count the frequency that a bunch of values occurs in a dataframe column

Question

I'm very new to Python and pandas. One column of my dataframe has 15000 values, like this:

col1	col2
5	0.05964
19	0.00325
31	0.0225
12	0.03325
14	0.00525

I would like to output a result like this:

0.00 to 0.01 = 55 values, 
0.01 to 0.02 = 365 values, 
0.02 to 0.03 = 5464 values etc... from 0.00 to 1.00

I'm a bit lost with groupby, count.values, etc...

Thanks for your help!

Answer 1

IIUC, use pd.cut to split col2 into 100 bins of width 0.01 and group by those bins:

import numpy as np
import pandas as pd
out = df.groupby(pd.cut(df['col2'], np.linspace(0, 1, 101)), observed=False)['col1'].sum()
print(out)

# Output
col2
(0.0, 0.01]     33
(0.01, 0.02]     0
(0.02, 0.03]    31
(0.03, 0.04]    12
(0.04, 0.05]     0
                ..
(0.95, 0.96]     0
(0.96, 0.97]     0
(0.97, 0.98]     0
(0.98, 0.99]     0
(0.99, 1.0]      0
Name: col1, Length: 100, dtype: int64
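
Note that the snippet above sums col1 within each bin. If the goal is the number of rows per bin, as the example output in the question suggests, a minimal sketch could use .size() on the same grouping. The sample dataframe below is rebuilt from the five rows shown in the question, and the names edges, bins and counts are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (the real column has about 15000 values)
df = pd.DataFrame({
    "col1": [5, 19, 31, 12, 14],
    "col2": [0.05964, 0.00325, 0.0225, 0.03325, 0.00525],
})

# 101 edges give 100 right-closed bins: (0.0, 0.01], (0.01, 0.02], ...
# A value of exactly 0.0 falls outside the first bin with these defaults.
edges = np.linspace(0, 1, 101)
bins = pd.cut(df["col2"], edges)

# Number of rows per bin; observed=False keeps the empty bins as well
counts = df.groupby(bins, observed=False).size()

# Print only the non-empty bins, in the style asked for in the question
for interval, n in counts[counts > 0].items():
    print(f"{interval.left:.2f} to {interval.right:.2f} = {n} values")
```

Dropping the counts > 0 filter would print all 100 bins, including the empty ones.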

