pandas hierarchical group by multiple columns
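I want to group by the columns 'Number3' and 'Event' and get the desired result shown below. Note that the first column is the index. I want to save the desired result to a new DataFrame.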
   Number1       Event  Number2  Number3
0       20      clouds       30      404
1       22  lightening       32      404
2       23     playing       33      405
3       25      clouds       35      410
4       24    sleeping       34      407
5       26  lightening       36      410
6       21        rain       31      404
7       27        rain       37      410
Desired result:
Number3     Event       Number1  Number2
404      0  clouds           20       30
         1  lightening       22       32
         6  rain             21       31
405      2  playing          23       33
410      3  clouds           25       35
         5  lightening       26       36
         7  rain             27       37
407      4  sleeping         24       34
You need set_index:
df1 = df.set_index(['Number3', 'Event'])
print(df1)
                    Number1  Number2
Number3 Event
404     clouds           20       30
        lightening       21       31
        rain             22       32
405     playing          23       33
410     sun              24       34
420     clouds           25       35
        lightening       26       36
        rain             27       37
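As a minimal, self-contained sketch of this step, the snippet below retypes the sample data from the question (how the asker actually built `df` is not shown, so the constructor call is an assumption) and moves 'Number3' and 'Event' into a two-level index.

```python
import pandas as pd

# Sample data retyped from the question; the original construction of df is not shown.
df = pd.DataFrame({
    'Number1': [20, 22, 23, 25, 24, 26, 21, 27],
    'Event': ['clouds', 'lightening', 'playing', 'clouds',
              'sleeping', 'lightening', 'rain', 'rain'],
    'Number2': [30, 32, 33, 35, 34, 36, 31, 37],
    'Number3': [404, 404, 405, 410, 407, 410, 404, 410],
})

# Move both columns into the index; the old 0..7 index is discarded.
df1 = df.set_index(['Number3', 'Event'])

print(list(df1.index.names))   # ['Number3', 'Event']
print(df1.columns.tolist())    # ['Number1', 'Number2']
```

Note that set_index returns a new DataFrame by default, so `df` itself is left unchanged and `df1` is the new frame.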
If you also need to keep the old index, add the parameter append=True and then call swaplevel:
df1 = df.set_index(['Number3', 'Event'], append=True).swaplevel(0, 1)
print(df1)
                      Number1  Number2
Number3   Event
404     0 clouds           20       30
        1 lightening       21       31
        2 rain             22       32
405     3 playing          23       33
410     4 sun              24       34
420     5 clouds           25       35
        6 lightening       26       36
        7 rain             27       37
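To make the effect of append=True and swaplevel easier to see, here is a small sketch that prints the index level names after each step. It uses the sample data from the question, retyped by hand; the intermediate name `kept` is only illustrative.

```python
import pandas as pd

# Sample data retyped from the question; the original construction of df is not shown.
df = pd.DataFrame({
    'Number1': [20, 22, 23, 25, 24, 26, 21, 27],
    'Event': ['clouds', 'lightening', 'playing', 'clouds',
              'sleeping', 'lightening', 'rain', 'rain'],
    'Number2': [30, 32, 33, 35, 34, 36, 31, 37],
    'Number3': [404, 404, 405, 410, 407, 410, 404, 410],
})

# append=True keeps the existing (unnamed) index as level 0
# and adds the new levels after it.
kept = df.set_index(['Number3', 'Event'], append=True)
print(list(kept.index.names))   # [None, 'Number3', 'Event']

# swaplevel(0, 1) exchanges the first two levels so Number3 comes first.
df1 = kept.swaplevel(0, 1)
print(list(df1.index.names))    # ['Number3', None, 'Event']
```

The row order is unchanged at this point; only the order of the index levels differs.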
Edit, after the question was changed:
Add sort_index:
df1 = (df.set_index(['Number3', 'Event'], append=True)
         .swaplevel(0, 1)
         .sort_index(level='Number3'))
print(df1)
                      Number1  Number2
Number3   Event
404     0 clouds           20       30
        1 lightening       22       32
        6 rain             21       31
405     2 playing          23       33
407     4 sleeping         24       34
410     3 clouds           25       35
        5 lightening       26       36
        7 rain             27       37
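The full chain can be checked end to end with the sketch below, again assuming the sample data from the question retyped by hand. The chained calls are wrapped in parentheses so that the expression can span several lines.

```python
import pandas as pd

# Sample data retyped from the question; the original construction of df is not shown.
df = pd.DataFrame({
    'Number1': [20, 22, 23, 25, 24, 26, 21, 27],
    'Event': ['clouds', 'lightening', 'playing', 'clouds',
              'sleeping', 'lightening', 'rain', 'rain'],
    'Number2': [30, 32, 33, 35, 34, 36, 31, 37],
    'Number3': [404, 404, 405, 410, 407, 410, 404, 410],
})

df1 = (
    df.set_index(['Number3', 'Event'], append=True)  # keep the old index as a level
      .swaplevel(0, 1)                               # put Number3 first
      .sort_index(level='Number3')                   # group equal Number3 values together
)

# Number3 is now in ascending order, with the old index sorted within each group.
print(df1.index.get_level_values('Number3').tolist())
# [404, 404, 404, 405, 407, 410, 410, 410]
print(df1.index.get_level_values(1).tolist())
# [0, 1, 6, 2, 4, 3, 5, 7]
```

Because sort_index sorts in ascending order, 407 comes before 410 here, whereas the desired result in the question lists 410 first.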