How do I split out a multi-index dataframe with a column full of dictionaries with different keys

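I have a dataframe like the one below (it has several variables, but I only care about turning the dictionary column into a separate dataframe):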

| Index    | Attributes     | Day  | Colour|
| -------- | -------------- | ---- |-------|
| Alpha    | {A1: 1, A2: 2} | Mon  |Black  |
| Bravo    | {A1: 3, B1: 4} | Mon  |Red   |
| Charlie  | {C1: 5, A2: 6} | Mon  |Yellow|

I only want the first two variables. How do I split it out like this?

| Index    | A1   | A2   | B1 | C1|
| -------- | ---- | ---- |----|----|
| Alpha    |1     |2     |N/A |N/A |
| Bravo    |3     |N/A   |4   |N/A |
| Charlie  |N/A   |6     |N/A |5   |

I'm really stumped by this. This is the code I tried:

new_df = pd.DataFrame(columns = ['Index'])
new_df['Index'] = old_df['Index'        
attribute_df = pd.Dataframe(old_df['attributes'])
new_df = pd.concat(new_df, attribute_df)

It didn't work!

Assuming the column Index is actually the index of the frame, use apply with pd.Series:

new_df = df['Attributes'].apply(pd.Series)
          A1   A2   B2   C1
Index                      
Alpha    1.0  2.0  NaN  NaN
Bravo    3.0  NaN  4.0  NaN
Charlie  NaN  6.0  NaN  5.0

Assuming Index is a regular column, add a join to merge the result back onto the DataFrame (with this option you can also keep more columns than just Index):

new_df = df[['Index']].join(df['Attributes'].apply(pd.Series))
     Index   A1   A2   B2   C1
0    Alpha  1.0  2.0  NaN  NaN
1    Bravo  3.0  NaN  4.0  NaN
2  Charlie  NaN  6.0  NaN  5.0
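
For the column case, a runnable sketch of the same join might look like the following. It reuses the sample data from this answer without `set_index`; keeping `Day` alongside `Index` is only an illustration of carrying extra columns through, and the name `expanded` is introduced here for readability.

```python
import pandas as pd

# Same sample data as in this answer, but Index stays a regular column.
df = pd.DataFrame({
    'Index': ['Alpha', 'Bravo', 'Charlie'],
    'Attributes': [{'A1': 1, 'A2': 2}, {'A1': 3, 'B2': 4}, {'C1': 5, 'A2': 6}],
    'Day': ['Mon', 'Mon', 'Mon'],
    'Colour': ['Black', 'Red', 'Yellow'],
})

# Expand each dictionary into columns; keys missing from a row become NaN.
expanded = df['Attributes'].apply(pd.Series)

# Join on the default integer index, keeping whichever columns are wanted.
new_df = df[['Index', 'Day']].join(expanded)
print(new_df)
```

Because both frames share the same default integer index, `join` lines the rows up without needing an explicit key.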


Complete working example:

import pandas as pd

df = pd.DataFrame({
    'Index': ['Alpha', 'Bravo', 'Charlie'],
    'Attributes': [{'A1': 1, 'A2': 2}, {'A1': 3, 'B2': 4}, {'C1': 5, 'A2': 6}],
    'Day': ['Mon', 'Mon', 'Mon'],
    'Colour': ['Black', 'Red', 'Yellow']
}).set_index('Index')

new_df = df['Attributes'].apply(pd.Series)
print(new_df)
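
As a possible alternative (not the approach used above), the dictionaries can be passed straight to the `pd.DataFrame` constructor. This is a sketch under the assumption that every entry in `Attributes` is a plain dict; it tends to be faster than `apply(pd.Series)` on large frames, though that is worth measuring on your own data.

```python
import pandas as pd

df = pd.DataFrame({
    'Index': ['Alpha', 'Bravo', 'Charlie'],
    'Attributes': [{'A1': 1, 'A2': 2}, {'A1': 3, 'B2': 4}, {'C1': 5, 'A2': 6}],
    'Day': ['Mon', 'Mon', 'Mon'],
    'Colour': ['Black', 'Red', 'Yellow'],
}).set_index('Index')

# Build the new frame from the list of dicts, reusing the original index
# so the rows stay labelled Alpha, Bravo, Charlie.
new_df = pd.DataFrame(df['Attributes'].tolist(), index=df.index)
print(new_df)
```

Passing `index=df.index` matters here: without it the result would get a fresh 0..n-1 index and lose the row labels.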