Building a dictionary of dictionaries to group data by the values in one column, then by the values in a second column

I'm new to Python and have been stuck on one problem for a few days. I wrote a script that:

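- takes data from a CSV file
- sorts it by identical values in the first column of the data file
- inserts the sorted data at a specific line of a separate template text file
- saves as many copies of the file as there are distinct values in the first column of the data file

The picture below shows how it works: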

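But I still have two more things to do. When, within one of the separate files shown above, the second column of the data file contains repeated values, the file should list the third-column values together under that value instead of repeating the same second-column value. The picture below shows what it should look like: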

I also need to add, somewhere in the file, the value from the first column of the data file split on "_".

Here is the data file:

111_0,3005,QWE
111_0,3006,SDE
111_0,3006,LFR
111_1,3005,QWE
111_1,5345,JTR
112_0,3103,JPP
112_0,3343,PDK 
113_0,2137,TRE
113_0,2137,OMG

And here is the code I wrote:

import shutil

with open("data.csv") as f:
    contents = f.read()
    contents = contents.splitlines()

values_per_baseline = dict()

for line in contents:
    key = line.split(',')[0]
    values = line.split(',')[1:]
    if key not in values_per_baseline:
        values_per_baseline[key] = []
    values_per_baseline[key].append(values)

for file in values_per_baseline.keys():
    x = 3
    shutil.copyfile("of.txt", (f"of_%s.txt" % file))
    filename = f"of_%s.txt" % file
    for values in values_per_baseline[file]:
        with open(filename, "r") as f:
            contents = f.readlines()
            contents.insert(x, '      o = ' + values[0] + '\n          ' + 'a = ' + values[1] +'\n')
        with open(filename, "w") as f:
            contents = "".join(contents)
            f.write(contents)
            f.close()

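I have been trying to build something like a dictionary of dictionaries of lists, but I couldn't implement it in a way that works. Any help or suggestions would be much appreciated.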

Contents of datafile.csv:

111_0,3005,QWE
111_0,3006,SDE
111_0,3006,LFR
111_1,3005,QWE
111_1,5345,JTR
112_0,3103,JPP
112_0,3343,PDK 
113_0,2137,TRE
113_0,2137,OMG

A possible solution is the following:

def nested_list_to_dict(lst):
    result = {}
    subgroup = {}
    if all(len(l) == 3 for l in lst):
        for first, second, third in lst:
            result.setdefault(first, []).append((second, third))
        for k, v in result.items():
            for item1, item2 in v:
                subgroup.setdefault(item1, []).append(item2.strip())
            result[k] = subgroup
            subgroup = {}
    else:
        print("Input data must have 3 items like '111_0,3005,QWE'")
    return result


with open("datafile.csv", "r", encoding="utf-8") as f:
    content = f.read().splitlines()

data = nested_list_to_dict([line.split(',') for line in content])
print(data)

# ... rest of your code ....

Prints:

{'111_0': {'3005': ['QWE'], '3006': ['SDE', 'LFR']}, 
 '111_1': {'3005': ['QWE'], '5345': ['JTR']}, 
 '112_0': {'3103': ['JPP'], '3343': ['PDK']}, 
 '113_0': {'2137': ['TRE', 'OMG']}}
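
As a rough sketch of how a nested dictionary of this shape might then be turned into the lines for each output file, and how the first-column value could be split on "_", something like the following may help. `split_key` and `render_entries` are hypothetical helper names, the ", " separator is an assumption, and the sample `data` is just a subset of the output above.

```python
def split_key(key):
    # Assumes the first-column value looks like "111_0": text, one underscore, text.
    prefix, _, suffix = key.partition("_")
    return prefix, suffix


def render_entries(groups, separator=", "):
    # groups maps a second-column value to the list of its third-column values.
    # The separator is an assumption; use whatever the template expects.
    lines = []
    for second, thirds in groups.items():
        lines.append(f"      o = {second}\n")
        lines.append(f"          a = {separator.join(thirds)}\n")
    return lines


# Subset of the nested dictionary printed above.
data = {
    "111_0": {"3005": ["QWE"], "3006": ["SDE", "LFR"]},
    "113_0": {"2137": ["TRE", "OMG"]},
}

for key, groups in data.items():
    prefix, suffix = split_key(key)
    print(prefix, suffix)
    print("".join(render_entries(groups)), end="")
```

The list returned by `render_entries` could be inserted into the template's line list at the desired index in one step, which avoids reopening the file for every value.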

You could try the following:

import csv
from collections import defaultdict


values_per_baseline = defaultdict(lambda: defaultdict(list))
with open("data.csv", "r") as file:
    for key1, key2, value in csv.reader(file):
        values_per_baseline[key1][key2].append(value)

x = 3
for filekey, content in values_per_baseline.items():
    with open("of.txt", "r") as fin,\
         open(f"of_{filekey}.txt", "w") as fout:
        fout.writelines(next(fin) for _ in range(x))
        for key, values in content.items():
            fout.write(
                f'      o = {key}\n'
                + '          a = '
                + ' <COMMA> '.join(values)
                + '\n'
            )
        fout.writelines(fin)

The input-reading part uses the csv module from the standard library (for convenience) and a defaultdict. The file is read into a nested dictionary.
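
One detail worth noting: csv.reader keeps trailing whitespace, so a value such as "PDK " in the sample data would be stored with its space. Below is a minimal sketch of the same grouping with each field stripped and with blank or malformed rows skipped. It reads from an in-memory string (`sample`, a stand-in for data.csv) so it runs without any files, and the names `grouped` and `plain` are only illustrative.

```python
import csv
import io
from collections import defaultdict

# Stand-in for data.csv so the sketch runs without any files on disk.
sample = "111_0,3005,QWE\n111_0,3006,SDE\n111_0,3006,LFR\n\n112_0,3343,PDK \n"

grouped = defaultdict(lambda: defaultdict(list))
for row in csv.reader(io.StringIO(sample)):
    if len(row) != 3:
        continue  # skip blank or malformed rows instead of failing on unpacking
    key1, key2, value = (field.strip() for field in row)
    grouped[key1][key2].append(value)

# Convert to plain dicts so the structure prints readably.
plain = {key: dict(inner) for key, inner in grouped.items()}
print(plain)
```

With a real file, `io.StringIO(sample)` would be replaced by the file object returned by `open("data.csv", newline="")`.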