在多列索引 pandas DataFrame 中将列级别移至顶部
Move column level to top in multi column index pandas DataFrame
在 pandas 多列索引中将特定列级别移动到顶部的 pythonic 方法是什么?
玩具示例:
import numpy as np
import pandas as pd
cols = pd.MultiIndex.from_arrays(
[
["a1", "a1", "a1", "a1"],
["b1", "b1", "b2", "b2"],
["x1", "x1", "x1", "x1"],
["c1", "c1", "c1", "c1"],
],
names=(
"unknown_level_name_0",
"unknown_level_name_1",
"known_level_name",
"unknown_level_name_last",
),
)
df = pd.DataFrame(np.random.randint(0, 100, [5, 4]), columns=cols)
print(df)
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
known_level_name x1 x1
unknown_level_name_last c1 c1 c1 c1
0 37 34 97 19
1 54 47 53 46
2 63 94 14 85
3 16 51 27 96
4 32 64 76 26
我正在寻找以下结果:
known_level_name x1 x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 37 34 97 19
1 54 47 53 46
2 63 94 14 85
3 16 51 27 96
4 32 64 76 26
编辑:
可以有可变数量的级别。大多数级别名称是未知的。但是,总会有一个熟悉的关卡名称(此处为:“known_level_name”)。
如果我不知道“known_level_name”的确切位置,使用 reorder_levels
或 swaplevel
可能会变得棘手。
你可以试试reorder_levels
out = df.reorder_levels([2,0,1,3], axis=1)
Out[184]:
known_level_name x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 98 32 72 94
1 22 71 15 2
2 25 41 42 38
3 87 74 41 82
4 87 31 18 8
使用reorder_levels
(also available as DataFrame method):
# by index
df.columns = df.columns.reorder_levels(order=[2,0,1,3])
或:
# by name
df.columns = df.columns.reorder_levels(order=['unknown_level_name_1',
'known_level_name',
'unknown_level_name_0',
'unknown_level_name_last'
])
替代,多个swaplevel
:
# move level 2 up # move again up
df.swaplevel(2,1,axis=1).swaplevel(1,0,axis=1)
输出:
known_level_name x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 17 68 61 88
1 6 62 81 7
2 82 16 85 92
3 40 22 48 0
4 35 46 68 60
这是一个将列级别(按标签或索引)移动到顶部的通用函数:
def move_top(df, col, inplace=False):
if col in df.columns.names:
idx = df.columns.names.index(col)
elif isinstance(col, int) and 0 < col < len(df.columns.names):
idx = col
else:
raise IndexError(f'invalid index "{col}"')
order = list(range(len(df.columns.names)))
order.pop(idx)
order = [idx]+order
if inplace:
df.columns = df.columns.reorder_levels(order=order)
else:
return df.reorder_levels(order, axis=1)
move_top(df, 'known_level_name')
输出:
known_level_name x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 33 30 23 77
1 10 73 80 33
2 7 54 52 9
3 71 99 22 22
4 83 15 86 40
在 pandas 多列索引中将特定列级别移动到顶部的 pythonic 方法是什么?
玩具示例:
import numpy as np
import pandas as pd
cols = pd.MultiIndex.from_arrays(
[
["a1", "a1", "a1", "a1"],
["b1", "b1", "b2", "b2"],
["x1", "x1", "x1", "x1"],
["c1", "c1", "c1", "c1"],
],
names=(
"unknown_level_name_0",
"unknown_level_name_1",
"known_level_name",
"unknown_level_name_last",
),
)
df = pd.DataFrame(np.random.randint(0, 100, [5, 4]), columns=cols)
print(df)
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
known_level_name x1 x1
unknown_level_name_last c1 c1 c1 c1
0 37 34 97 19
1 54 47 53 46
2 63 94 14 85
3 16 51 27 96
4 32 64 76 26
我正在寻找以下结果:
known_level_name x1 x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 37 34 97 19
1 54 47 53 46
2 63 94 14 85
3 16 51 27 96
4 32 64 76 26
编辑:
可以有可变数量的级别。大多数级别名称是未知的。但是,总会有一个熟悉的关卡名称(此处为:“known_level_name”)。
如果我不知道“known_level_name”的确切位置,使用 reorder_levels
或 swaplevel
可能会变得棘手。
你可以试试reorder_levels
out = df.reorder_levels([2,0,1,3], axis=1)
Out[184]:
known_level_name x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 98 32 72 94
1 22 71 15 2
2 25 41 42 38
3 87 74 41 82
4 87 31 18 8
使用reorder_levels
(also available as DataFrame method):
# by index
df.columns = df.columns.reorder_levels(order=[2,0,1,3])
或:
# by name
df.columns = df.columns.reorder_levels(order=['unknown_level_name_1',
'known_level_name',
'unknown_level_name_0',
'unknown_level_name_last'
])
替代,多个swaplevel
:
# move level 2 up # move again up
df.swaplevel(2,1,axis=1).swaplevel(1,0,axis=1)
输出:
known_level_name x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 17 68 61 88
1 6 62 81 7
2 82 16 85 92
3 40 22 48 0
4 35 46 68 60
这是一个将列级别(按标签或索引)移动到顶部的通用函数:
def move_top(df, col, inplace=False):
if col in df.columns.names:
idx = df.columns.names.index(col)
elif isinstance(col, int) and 0 < col < len(df.columns.names):
idx = col
else:
raise IndexError(f'invalid index "{col}"')
order = list(range(len(df.columns.names)))
order.pop(idx)
order = [idx]+order
if inplace:
df.columns = df.columns.reorder_levels(order=order)
else:
return df.reorder_levels(order, axis=1)
move_top(df, 'known_level_name')
输出:
known_level_name x1
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 33 30 23 77
1 10 73 80 33
2 7 54 52 9
3 71 99 22 22
4 83 15 86 40