Transforming multiple columns into a list of nested dictionaries in pandas

I have a pandas DataFrame that looks like this:

category    sub_cat     vitals      value
HR          EKG         HR_EKG      136
SPO2        SPO2        SpO2_1      86
HR          PPG         HR_PPG_1    135
SPO2        PI          PI_1        4.25
HR          PPG         HR_PULSED   135
NIBP        SBP         NIBPS       73
NIBP        DBP         NIBPD       25
NIBP        MBP         NIBPM       53

I want to group by the category and sub_cat columns and convert the result into a list of nested dictionaries, something like this:

[{
    "HR":
    {
        "EKG":
        {
            "HR_EKG": 136
        },
        "PPG":
        {
            "HR_PPG_1": 135,
            "HR_PULSED": 135
        }
    }
  },
  {
    "NIBP":
    {
        "SBP":
        {
            "NIBPS": 73
        },
        "DBP":
        {
            "NIBPD": 25
        },
        "MBP":
        {
            "NIBPM": 53
        }
    }
  },
  {
    "SPO2":
    {
        "SPO2":
        {
            "SpO2_1": 86
        },
        "PI":
        {
            "PI_1": 4.25
        }
    }
}]

I am able to group by (category, vitals, value) or by (sub_cat, vitals, value), but not by all 4 columns at once. This is what I tried, and it works for 3 columns:

df = df.groupby(['sub_cat']).apply(lambda x: dict(zip(x['vitals'], x['value'])))

A series of nested groupby + apply + to_dict calls will do it:

dct = df.groupby('category').apply(
    lambda category: category.groupby('sub_cat').apply(
        lambda sub_cat: sub_cat.set_index('vitals')['value'].to_dict()
    ).to_dict()
).to_dict()

Output:

>>> import json
>>> print(json.dumps(dct, indent=4))
{
    "HR": {
        "EKG": {
            "HR_EKG": 136.0
        },
        "PPG": {
            "HR_PPG_1": 135.0,
            "HR_PULSED": 135.0
        }
    },
    "NIBP": {
        "DBP": {
            "NIBPD": 25.0
        },
        "MBP": {
            "NIBPM": 53.0
        },
        "SBP": {
            "NIBPS": 73.0
        }
    },
    "SPO2": {
        "PI": {
            "PI_1": 4.25
        },
        "SPO2": {
            "SpO2_1": 86.0
        }
    }
}
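
Note that the result above is a single nested dict with the groups in sorted order, whereas the question asks for a list with one dict per category. A minimal sketch of one way to get that shape, assuming the frame from the question is rebuilt by hand (the name result is just illustrative), is to loop over the groups instead of calling apply and to pass sort=False so categories keep their order of first appearance:

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "category": ["HR", "SPO2", "HR", "SPO2", "HR", "NIBP", "NIBP", "NIBP"],
    "sub_cat": ["EKG", "SPO2", "PPG", "PI", "PPG", "SBP", "DBP", "MBP"],
    "vitals": ["HR_EKG", "SpO2_1", "HR_PPG_1", "PI_1",
               "HR_PULSED", "NIBPS", "NIBPD", "NIBPM"],
    "value": [136, 86, 135, 4.25, 135, 73, 25, 53],
})

# Build one single-key dict per category: {category: {sub_cat: {vitals: value}}}.
result = []
for category, cat_df in df.groupby("category", sort=False):
    nested = {}
    for sub_cat, sub_df in cat_df.groupby("sub_cat", sort=False):
        # Innermost level maps each vitals name to its value.
        nested[sub_cat] = dict(zip(sub_df["vitals"], sub_df["value"]))
    result.append({category: nested})
```

The list can then be printed with json.dumps(result, indent=4) as before. The values still come out as floats (136.0 rather than 136) because the value column also holds 4.25, so pandas stores the whole column as float64.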