How to deal with multiple levels of nested dicts and lists in Python
I have a list of dicts, and each dict contains another list of dicts.
Example object:
my_obj = [
    {
        "weight": 3000,
        "data": [
            {
                "date": datetime.datetime(2020, 11, 3, 0, 0),
                "value": 8.5
            },
            {
                "date": datetime.datetime(2020, 11, 4, 0, 0),
                "value": 9.3
            },
            {...}
        ]
    },
    {
        "weight": 2000,
        "data": [
            {
                "date": datetime.datetime(2020, 11, 3, 0, 0),
                "value": 8.2
            },
            {
                "date": datetime.datetime(2020, 11, 4, 0, 0),
                "value": 8
            },
            {...}
        ]
    },
    {...}
]
I need to do some math with these values and weights and return a single data list.
Expected result:
"data": [
    {
        "date": datetime.datetime(2020, 11, 3, 0, 0),
        "value": '(
            ( first_nested_data_list(value[0]) * first_nested_data_list(weight) ) +
            ( second_nested_data_list(value[0]) * second_nested_data_list(weight) ) +
            ( third_nested_data_list(value[0]) * third_nested_data_list(weight) )
        ) / sum(all_weight)'
    },
    {
        "date": datetime.datetime(2020, 11, 4, 0, 0),
        "value": '(
            ( first_nested_data_list(value[1]) * first_nested_data_list(weight) ) +
            ( second_nested_data_list(value[1]) * second_nested_data_list(weight) ) +
            ( third_nested_data_list(value[1]) * third_nested_data_list(weight) )
        ) / sum(all_weight)'
    },
    {...}
]
# or
"data": [
    {
        "date": datetime.datetime(2020, 11, 3, 0, 0),
        "value": ( (8.5 * 3000) + (8.2 * 2000) ) / 5000
    },
    {...}
]
I tried using zip, but since I don't know the length of my_obj, I couldn't solve it that way.
Any help would be appreciated!
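I'd approach this in two steps. First, transpose your data so the items are grouped by date rather than by weight, as they are now. Then, in a separate step, find the weighted average for each date.
Since you need to be able to look up the weights and values by day, I'd use the date as the key in an intermediate mapping (before converting it back at the end into the JSON-like structure that uses labels as keys):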
transposed = {}  # build a dict in this format: {date: [(weight, value), ...]}
for wdict in my_obj:
    weight = wdict["weight"]
    for ddict in wdict["data"]:
        date = ddict["date"]
        value = ddict["value"]
        transposed.setdefault(date, []).append((weight, value))

results = []  # format is [{"date": date, "value": weighted_average}, ...]
for date, weighted_values_list in transposed.items():
    weighted_average = (sum(weight * value for weight, value in weighted_values_list) /
                        sum(weight for weight, value in weighted_values_list))
    results.append({"date": date, "value": weighted_average})

# optionally wrap the results list in another dictionary
# final_results = {"data": results}
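You could change the transposition step to sum the weighted values and the weights directly, rather than collecting them in lists to be summed later. The second step would then be just the division that produces the weighted average. I prefer the list-based approach, though I'm not entirely sure why.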
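The running-sum variant described above might look like the following. This is a minimal sketch, not part of the original answer: the sample data is the two complete groups from the question with the elided entries left out, and the name totals is made up for illustration.

```python
import datetime

# Sample data from the question, trimmed to the two fully specified groups.
my_obj = [
    {"weight": 3000, "data": [
        {"date": datetime.datetime(2020, 11, 3), "value": 8.5},
        {"date": datetime.datetime(2020, 11, 4), "value": 9.3},
    ]},
    {"weight": 2000, "data": [
        {"date": datetime.datetime(2020, 11, 3), "value": 8.2},
        {"date": datetime.datetime(2020, 11, 4), "value": 8},
    ]},
]

# totals is a hypothetical name: {date: [sum of weight * value, sum of weights]}
totals = {}
for wdict in my_obj:
    weight = wdict["weight"]
    for ddict in wdict["data"]:
        running = totals.setdefault(ddict["date"], [0, 0])
        running[0] += weight * ddict["value"]
        running[1] += weight

# The second step is now a single division per date.
results = [{"date": date, "value": weighted_sum / weight_sum}
           for date, (weighted_sum, weight_sum) in totals.items()]
```

With this sample data, the value for 2020-11-03 comes out at roughly 8.38, which matches the ( (8.5 * 3000) + (8.2 * 2000) ) / 5000 example in the question.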
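You can use a nested defaultdict as temporary storage for the data grouped by date, then iterate over this temporary storage to compute the average for each date and save it in the required form.
Code: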
from collections import defaultdict

my_obj = [ ... ]

temp = defaultdict(lambda: defaultdict(int))
for obj in my_obj:
    for i in obj["data"]:
        temp[i["date"]]["sum"] += i["value"] * obj["weight"]
        temp[i["date"]]["weight"] += obj["weight"]

result = [{"date": k, "value": v["sum"] / v["weight"]} for k, v in temp.items()]
# add {"data": [ ... ]} if you need result in form provided in question
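Regarding the zip attempt mentioned in the question: unpacking with zip(*...) accepts any number of lists, so the length of my_obj does not need to be known in advance. The sketch below is only an illustration under a strong assumption that neither answer relies on, namely that every "data" list holds the same dates in the same order. The sample data and the names weights, total_weight and zipped_results are made up for this example.

```python
import datetime

# Sample data from the question, trimmed to the two fully specified groups.
my_obj = [
    {"weight": 3000, "data": [
        {"date": datetime.datetime(2020, 11, 3), "value": 8.5},
        {"date": datetime.datetime(2020, 11, 4), "value": 9.3},
    ]},
    {"weight": 2000, "data": [
        {"date": datetime.datetime(2020, 11, 3), "value": 8.2},
        {"date": datetime.datetime(2020, 11, 4), "value": 8},
    ]},
]

weights = [wdict["weight"] for wdict in my_obj]
total_weight = sum(weights)

zipped_results = []
# zip(*...) yields one tuple per position, holding that position's entry
# from every "data" list, however many lists my_obj contains.
for entries in zip(*(wdict["data"] for wdict in my_obj)):
    weighted_sum = sum(w * entry["value"] for w, entry in zip(weights, entries))
    # Assumption: all entries at this position share the first entry's date.
    zipped_results.append({"date": entries[0]["date"],
                           "value": weighted_sum / total_weight})
```

Note that zip stops at the shortest list and never compares dates, so if the lists can differ in length or ordering, grouping by date as in the answers above is the safer choice.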