Reverse indexing a (small) dataset in Python without pandas or NumPy

I'm a total Python beginner working through a Python course, and I'm stumped by this problem. My function only returns

{'python': [2], 'rules': [2]} 
#  (all words should be lowercase)

instead of the whole index, which should be:

{'python': [0, 2],'time': [0, 1],'it': [1],'is': [1],'that': [1],'rules':[2]}

Any help would be greatly appreciated!

from collections import defaultdict

dataset = [
    "Python time",
    "It is that TIME",
    "python rules"
 ] 

index_dictionary = {}

def reverse_index(dataset):
    
    for index in range(len(dataset)):
        phrase = dataset[index]
        words = phrase.lower()
        wordlist = words.split()

    for x in wordlist:
        if x in index_dictionary.keys():
            index_dictionary[x].append(index)
        else:
            index_dictionary[x] = [index]
    return (index_dictionary)

print(reverse_index(dataset))

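Your code almost works; there is just a small indentation error. In your version the second for loop sits outside the first, so it runs only once, after the first loop has finished, and sees only the last phrase's wordlist and index. You need a nested for loop, because you want to update the index for every sentence in the dataset. Creating index_dictionary inside the function also means each call starts from an empty dictionary: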

def reverse_index(dataset):

    index_dictionary = {}

    for index in range(len(dataset)):
        phrase = dataset[index]
        words = phrase.lower()
        wordlist = words.split()

        for x in wordlist:
            if x in index_dictionary.keys():
                index_dictionary[x].append(index)
            else:
                index_dictionary[x] = [index]
    return (index_dictionary)
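
As an alternative, since the question already imports defaultdict, here is a minimal sketch of the same nested-loop idea using defaultdict and enumerate. It is one possible way to write it, not the only one; the name reverse_index and the sample dataset are taken from the question.

```python
from collections import defaultdict

def reverse_index(dataset):
    # Map each lowercase word to the list of phrase indices that contain it.
    index_dictionary = defaultdict(list)
    # enumerate yields (index, phrase) pairs, replacing range(len(dataset)).
    for index, phrase in enumerate(dataset):
        for word in phrase.lower().split():
            # defaultdict creates the empty list on first access,
            # so no "if word in ..." check is needed.
            index_dictionary[word].append(index)
    # Convert to a plain dict so the printed result looks like a normal dict.
    return dict(index_dictionary)

dataset = [
    "Python time",
    "It is that TIME",
    "python rules",
]

print(reverse_index(dataset))
# {'python': [0, 2], 'time': [0, 1], 'it': [1], 'is': [1], 'that': [1], 'rules': [2]}
```

Note that, like the loop version above, this appends an index once per occurrence, so a word repeated within a single phrase would list that phrase's index more than once.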