How to deal with multiple levels of nested dicts and lists in Python

I have a list of dicts, each of which contains a nested list of dicts.

Example object:

my_obj = [
    {
        "weight": 3000,
        "data": [
            {
                "date": datetime.datetime(2020, 11, 3, 0, 0),
                "value": 8.5
            },
            {
                "date": datetime.datetime(2020, 11, 4, 0, 0),
                "value": 9.3
            },
            {...}
        ]
    },
    {
        "weight": 2000,
        "data": [
            {
                "date": datetime.datetime(2020, 11, 3, 0, 0),
                "value": 8.2
            },
            {
                "date": datetime.datetime(2020, 11, 4, 0, 0),
                "value": 8
            },
            {...}
        ]
    },
    {...}
]

I need to do some math with these values and weights and return a single list of data.

Expected result:

"data": [
    {
        "date": datetime.datetime(2020, 11, 3, 0, 0),
        "value": '(
        ( first_nested_data_list(value[0]) * first_nested_data_list(weight) )+
        ( second_nested_data_list(value[0]) * second_nested_data_list(weight) )+
        ( third_nested_data_list(value[0]) * third_nested_data_list(weight) )
        ) / sum(all_weight)'
    },
    {
        "date": datetime.datetime(2020, 11, 4, 0, 0),
        "value": '(
        ( first_nested_data_list(value[1]) * first_nested_data_list(weight) ) +
        ( second_nested_data_list(value[1]) * second_nested_data_list(weight) ) +
        ( third_nested_data_list(value[1]) * third_nested_data_list(weight) )
        ) / sum(all_weight)'
    },
    {...}
]

# or

"data": [
    {
        "date": datetime.datetime(2020, 11, 3, 0, 0),
        "value": ( (8.5 * 3000) + (8.2 * 2000) ) / 5000
    },
    {...}
]

I tried using zip, but since I don't know the length of my_obj, I couldn't get it to work.

Any help would be appreciated!

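I'd approach this problem in two steps. First, transpose your data so that the items are grouped by date rather than by weight, as they are now. Then, in a separate step, find the weighted average for each date.

Since you need to be able to look up the weights and values by day, I'd use the dates as keys in an intermediate mapping (before converting it back into the JSON-like structure, with the labels as keys, at the end):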

transposed = {}         # build a dict in this format: {date: [(weight, value), ...]}
for wdict in my_obj:
    weight = wdict["weight"]
    for ddict in wdict["data"]:
        date = ddict["date"]
        value = ddict["value"]
        transposed.setdefault(date, []).append((weight, value))

results = []           # format is [{"date": date, "value": weighted_average}, ...]
for date, weighted_values_list in transposed.items():
    weighted_average = (sum(weight * value for weight, value in weighted_values_list) /
                        sum(weight for weight, value in weighted_values_list))
    results.append({"date": date, "value": weighted_average})

# optionally wrap the results list in another dictionary
# final_results = {"data": results}

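You could change the transposing step to sum the weighted values and the weights directly, rather than collecting them in lists to be summed later. The second step would then only need to do the division for the weighted average. I prefer the list-based approach, though I'm not entirely sure why.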
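The running-sum variant described above could look roughly like this. It is only a sketch, assuming the same my_obj structure as in the question; the names weighted_average_by_date, totals, and sample are made up for illustration.

```python
import datetime

def weighted_average_by_date(my_obj):
    """Return [{"date": ..., "value": weighted average}, ...] using running sums."""
    totals = {}  # {date: [sum of weight * value, sum of weights]}
    for wdict in my_obj:
        weight = wdict["weight"]
        for ddict in wdict["data"]:
            sums = totals.setdefault(ddict["date"], [0, 0])
            sums[0] += weight * ddict["value"]  # accumulate the weighted value
            sums[1] += weight                   # accumulate the weight itself
    # the second step is now just one division per date
    return [{"date": date, "value": weighted_sum / weight_sum}
            for date, (weighted_sum, weight_sum) in totals.items()]

# hypothetical sample built from the first date in the question
sample = [
    {"weight": 3000, "data": [{"date": datetime.datetime(2020, 11, 3), "value": 8.5}]},
    {"weight": 2000, "data": [{"date": datetime.datetime(2020, 11, 3), "value": 8.2}]},
]
results = weighted_average_by_date(sample)  # value is roughly 8.38
```

Because each date keeps its own weight total, this still works if some entries are missing a date.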

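You can use a nested defaultdict as temporary storage for the data grouped by date, then iterate over that temporary storage to compute the average for each date and save it in the desired form.

Code: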

from collections import defaultdict

my_obj = [ ... ]

temp = defaultdict(lambda: defaultdict(int))
for obj in my_obj:
    for i in obj["data"]:
        temp[i["date"]]["sum"] += i["value"] * obj["weight"]
        temp[i["date"]]["weight"] += obj["weight"]
result = [{"date": k, "value": v["sum"] / v["weight"]} for k, v in temp.items()]
# add {"data": [ ... ]} if you need result in form provided in question
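As for the zip attempt mentioned in the question: zip can take any number of lists through argument unpacking, so the length of my_obj does not need to be known in advance. The following is a rough sketch only, and it rests on an assumption the question does not confirm: every "data" list must contain the same dates in the same order. The names weighted_average_with_zip and sample are hypothetical.

```python
import datetime

def weighted_average_with_zip(my_obj):
    # ASSUMPTION: every "data" list has the same dates in the same order.
    weights = [entry["weight"] for entry in my_obj]
    total_weight = sum(weights)
    results = []
    # zip(*...) unpacks however many "data" lists there are
    for same_day in zip(*(entry["data"] for entry in my_obj)):
        weighted_sum = sum(w * d["value"] for w, d in zip(weights, same_day))
        results.append({"date": same_day[0]["date"],
                        "value": weighted_sum / total_weight})
    return results

# hypothetical sample using the numbers from the question
sample = [
    {"weight": 3000, "data": [
        {"date": datetime.datetime(2020, 11, 3), "value": 8.5},
        {"date": datetime.datetime(2020, 11, 4), "value": 9.3},
    ]},
    {"weight": 2000, "data": [
        {"date": datetime.datetime(2020, 11, 3), "value": 8.2},
        {"date": datetime.datetime(2020, 11, 4), "value": 8},
    ]},
]
zip_results = weighted_average_with_zip(sample)  # values are roughly 8.38 and 8.78
```

If the lists can have gaps or differ in order, grouping by date as in the answers above is the safer choice, since zip pairs items purely by position and stops at the shortest list.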