How to sort a MultiIndex by one level without changing the order of the other levels

Question

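I am struggling to sort a pivot table by one level of its MultiIndex. My goal is to sort the values in that level according to a predefined list of values, which basically works. But I also want to keep the original order of the other levels.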

import pandas as pd
import numpy as np
import random

group_size = 3
n = 10
df = pd.DataFrame({
    'i_a': list(np.arange(0, group_size))*n,
    'i_b': random.choices(list("ARBMC"), k=n*group_size),
    'value': np.random.randint(0, 100, size=n*group_size),
})

pt = pd.pivot_table(
    df,
    index=['i_a', 'i_b'],
    values=['value'],
    aggfunc='sum'
)
# The pivot table looks like this
         value
i_a i_b       
0   A       48
    B       55
    C      161
    M       41
    R      126
1   A       60
    B      236
    C       99
    M       30
    R      202
2   A       22
    B      144
    C       30
    M      146
    R      168

# defined order for i_b
ORDER = {
    "A": 0,
    "R": 1,
    "B": 2,
    "M": 3,
    "C": 4,
}

def order_by_list(value, ascending=True):
    try:
        idx = ORDER[value]
    except KeyError:
        # place items which are not available at the last place
        idx = len(ORDER)
    if not ascending:
        # reverse the order
        idx = -idx
    return idx

def sort_by_ib(df):
    # sort only the i_b level, using the custom order defined above
    return df.sort_index(level=["i_b"],
                         key=lambda index: index.map(order_by_list),
                         sort_remaining=False
                         )

pt_sorted = pt.pipe(sort_by_ib)

# The i_a level of pt_sorted gets rearranged, which I don't want
         value
i_a i_b       
0   A       48
1   A       60
2   A       22
0   R      126
1   R      202
2   R      168
0   B       55
1   B      236
2   B      144
0   M       41
1   M       30
2   M      146
0   C      161
1   C       99
2   C       30


# Instead, the sorted pivot table should look like this
         value
i_a i_b       
0   A       48
    R      126
    B       55
    M       41
    C      161
1   A       60
    R      202
    B      236
    M       30
    C       99
2   A       22
    R      168
    B      144
    M      146
    C       30

What is the preferred/recommended way to do this?

Answer 1

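If you want to change the order, you can create a helper column holding the mapped values, add it to the index parameter of pivot_table, and finally remove it with droplevel. If it is placed before i_b, the result is sorted by i_a first and then by the new level: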

df['new'] = df['i_b'].map(ORDER)
pt = pd.pivot_table(
    df,
    index=[ 'i_a','new', 'i_b'],
    values=['value'],
    aggfunc='sum'
).droplevel(1)

print(pt)
         value
i_a i_b       
0   A      217
    R      135
    M      150
    C       43
1   A       44
    R      266
    B       44
    M       13
    C      128
2   A      167
    R        3
    B       85
    M      159
    C       81
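If the pivot table already exists and you would rather not rebuild it, a possible alternative is to sort on both levels and let the key function transform only the i_b level. This is a minimal sketch, not part of the answer above: it assumes pandas 1.1 or later (where sort_index accepts key), the names demo and level_key are made up for illustration, and the small hand-written frame stands in for the real pivot table. Note that i_a ends up in ascending order, which matches what pivot_table produces anyway.

```python
import pandas as pd

# Custom order for the i_b level, as defined in the question.
ORDER = {"A": 0, "R": 1, "B": 2, "M": 3, "C": 4}

# Hypothetical stand-in for the pivot table (values copied from the question).
demo = pd.DataFrame(
    {"value": [48, 55, 161, 41, 126, 60, 236, 99]},
    index=pd.MultiIndex.from_tuples(
        [(0, "A"), (0, "B"), (0, "C"), (0, "M"), (0, "R"),
         (1, "A"), (1, "B"), (1, "C")],
        names=["i_a", "i_b"],
    ),
)

def level_key(level_values):
    # For a MultiIndex, sort_index calls the key once per level,
    # so the level name tells us which one we are looking at.
    if level_values.name == "i_b":
        # Labels missing from ORDER are placed last.
        return level_values.map(lambda v: ORDER.get(v, len(ORDER)))
    # Every other level is sorted by its own values.
    return level_values

demo_sorted = demo.sort_index(level=["i_a", "i_b"], key=level_key)
print(demo_sorted)
```

Within each i_a group the rows should now follow A, R, B, M, C, while the groups themselves stay in i_a order. The same level_key could be passed to pt.sort_index in place of the lambda from the question.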

Tags: python, sorting, multi-index, pandas