Pandas 按索引计数分组和排序
Pandas group and sort by index count
说我有这个数据框
d = { 'Path' : ['abc', 'abc', 'ghi','ghi', 'jkl','jkl'],
'Detail' : ['foo', 'bar', 'bar','foo','foo','foo'],
'Program': ['prog1','prog1','prog1','prog2','prog3','prog3'],
'Value' : [30, 20, 10, 40, 40, 50],
'Field' : [50, 70, 10, 20, 30, 30] }
df = DataFrame(d)
df.set_index(['Path', 'Detail'], inplace=True)
df
Field Program Value
Path Detail
abc foo 50 prog1 30
bar 70 prog1 20
ghi bar 10 prog1 10
foo 20 prog2 40
jkl foo 30 prog3 40
foo 30 prog3 50
我可以按任何列对其进行分组和排序...
df_count = df.groupby('Program')
df_count.apply(lambda x: x.sort())
Program Path Detail Field Program Value
prog1 abc foo 50 prog1 30
ghi foo 20 prog1 40
jkl bar 10 prog1 30
prog2 abc bar 70 prog2 20
prog3 ghi foo 10 prog3 60
jkl foo 30 prog3 50
但我真正想要的是按计数对程序进行排序
df['Program'].value_counts()
prog1 3
prog3 2
prog2 1
dtype: int64
类似于
df_count.apply(lambda x: x.sort('Programs'.value_counts()))
最终目标是绘制它,使条形图按升序或降序排列。我该怎么做?
您可以将 count
添加为一列,然后您可以按它排序:
In [20]:
df['count'] = df['Program'].map(df['Program'].value_counts())
df
Out[20]:
Field Program Value count
Path Detail
abc foo 50 prog1 30 3
bar 70 prog1 20 3
ghi bar 10 prog1 10 3
foo 20 prog2 40 1
jkl foo 30 prog3 40 2
foo 30 prog3 50 2
In [23]:
df.sort('count', ascending=False)
Out[23]:
Field Program Value count
Path Detail
abc foo 50 prog1 30 3
bar 70 prog1 20 3
ghi bar 10 prog1 10 3
jkl foo 30 prog3 40 2
foo 30 prog3 50 2
ghi foo 20 prog2 40 1
说我有这个数据框
d = { 'Path' : ['abc', 'abc', 'ghi','ghi', 'jkl','jkl'],
'Detail' : ['foo', 'bar', 'bar','foo','foo','foo'],
'Program': ['prog1','prog1','prog1','prog2','prog3','prog3'],
'Value' : [30, 20, 10, 40, 40, 50],
'Field' : [50, 70, 10, 20, 30, 30] }
df = DataFrame(d)
df.set_index(['Path', 'Detail'], inplace=True)
df
Field Program Value
Path Detail
abc foo 50 prog1 30
bar 70 prog1 20
ghi bar 10 prog1 10
foo 20 prog2 40
jkl foo 30 prog3 40
foo 30 prog3 50
我可以按任何列对其进行分组和排序...
df_count = df.groupby('Program')
df_count.apply(lambda x: x.sort())
Program Path Detail Field Program Value
prog1 abc foo 50 prog1 30
ghi foo 20 prog1 40
jkl bar 10 prog1 30
prog2 abc bar 70 prog2 20
prog3 ghi foo 10 prog3 60
jkl foo 30 prog3 50
但我真正想要的是按计数对程序进行排序
df['Program'].value_counts()
prog1 3
prog3 2
prog2 1
dtype: int64
类似于
df_count.apply(lambda x: x.sort('Programs'.value_counts()))
最终目标是绘制它,使条形图按升序或降序排列。我该怎么做?
您可以将 count
添加为一列,然后您可以按它排序:
In [20]:
df['count'] = df['Program'].map(df['Program'].value_counts())
df
Out[20]:
Field Program Value count
Path Detail
abc foo 50 prog1 30 3
bar 70 prog1 20 3
ghi bar 10 prog1 10 3
foo 20 prog2 40 1
jkl foo 30 prog3 40 2
foo 30 prog3 50 2
In [23]:
df.sort('count', ascending=False)
Out[23]:
Field Program Value count
Path Detail
abc foo 50 prog1 30 3
bar 70 prog1 20 3
ghi bar 10 prog1 10 3
jkl foo 30 prog3 40 2
foo 30 prog3 50 2
ghi foo 20 prog2 40 1