Creating a nested dictionary in Python

Question

I have a dictionary in Python that currently looks like this:

{'apple': ['file1.txt', 'file2.txt', 'file3.txt'], 'banana': ['file1.txt', 'file2.txt'],
'carrot': ['file3.txt'],.....................................}

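I store the contents of each file in a list of lists, where each inner list holds the words in that file, along with a general list of the files used: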

[['hello', 'apple', 'test', 'banana'], ['weird', 'apple', 'tester', 'banana', 'apple'],........]]

['file1.txt', 'file2.txt', .....]

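I now want to create a new nested dictionary that holds all the information from the previous dictionary, plus the positions at which the term appears in each document (if it is present in that document).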

For example: I want print(dictionary['apple']) to return [{'file1.txt': [1]}, {'file2.txt': [1,4]},...... ] (telling me which documents the term appears in and its positions within each document).

My existing code for creating the dictionary is:


dict = {}
for i in range(len(textfile_list)):  # list of textfiles used
    check = file_contents[i]  # words of the i-th file; file_contents has the form [['word1',..],['word2','wordn',...]]
    for item in words:  # a list of every word from every file ['word1','wordn','word3',...]
        if item in check:
            if item not in dict:
                dict[item] = []
            dict[item].append(textfile_list[i])

dict = {k: list(set(v)) for k, v in dict.items()}

How can I do this?

Answer 1

You could organize your workflow along the following lines. Take it as a source of inspiration:

content = [['hello', 'apple', 'test', 'banana'], ['weird', 'apple', 'tester', 'banana', 'banana', 'apple']]
files = ['file1.txt', 'file2.txt']
# Map each file name to its list of words.
index = dict(zip(files, content))
# Every distinct word across all files.
words = {word for file_words in index.values() for word in file_words}
expected_dict = {}
for word in words:
    expected_dict[word] = []
    for key, value in index.items():
        if word in value:
            # Record every 0-based position at which the word occurs in this file.
            expected_dict[word].append({key: [idx for idx, w in enumerate(value) if w == word]})

Output (the order of the keys may vary, since words is a set):

{'test': [{'file1.txt': [2]}],
 'apple': [{'file1.txt': [1]}, {'file2.txt': [1, 5]}],
 'banana': [{'file1.txt': [3]}, {'file2.txt': [3, 4]}],
 'tester': [{'file2.txt': [2]}],
 'hello': [{'file1.txt': [0]}],
 'weird': [{'file2.txt': [0]}]}
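
If the inputs are large, a single pass over each file may be worth considering instead of rescanning every file for every word. The following is only a sketch under that assumption, not the only way to do it: it reuses the same sample content and files as above, and the intermediate name positions is made up for illustration. It uses collections.defaultdict and enumerate from the standard library.

```python
from collections import defaultdict

# Sample data copied from the answer above.
content = [['hello', 'apple', 'test', 'banana'],
           ['weird', 'apple', 'tester', 'banana', 'banana', 'apple']]
files = ['file1.txt', 'file2.txt']

# positions[word][filename] is the list of 0-based positions of word in that file.
# ("positions" is a hypothetical helper name, not something from the question.)
positions = defaultdict(lambda: defaultdict(list))
for filename, file_words in zip(files, content):
    for idx, word in enumerate(file_words):
        positions[word][filename].append(idx)

# Convert to the list-of-single-key-dicts shape asked for in the question.
expected_dict = {
    word: [{filename: idxs} for filename, idxs in per_file.items()]
    for word, per_file in positions.items()
}

print(expected_dict['apple'])  # [{'file1.txt': [1]}, {'file2.txt': [1, 5]}]
```

If you do not strictly need the list of one-entry dictionaries, keeping the intermediate mapping of word to file to positions may be easier to query, since positions['apple']['file2.txt'] gives the positions directly.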

python

dictionary

nlp

list