Reverse indexing a (small) dataset in Python without pandas or NumPy

Question

I'm a total Python beginner working through a Python course, and I'm stuck on this problem. My code only returns

{'python': [2], 'rules': [2]} 
#  (all words should be lowercase)

instead of the whole set, which should be:

{'python': [0, 2],'time': [0, 1],'it': [1],'is': [1],'that': [1],'rules':[2]}

Any help would be appreciated!

from collections import defaultdict

dataset = [
    "Python time",
    "It is that TIME",
    "python rules"
 ] 

index_dictionary = {}

def reverse_index(dataset):
    
    for index in range(len(dataset)):
        phrase = dataset[index]
        words = phrase.lower()
        wordlist = words.split()

    for x in wordlist:
        if x in index_dictionary.keys():
            index_dictionary[x].append(index)
        else:
            index_dictionary[x] = [index]
    return (index_dictionary)

print(reverse_index(dataset))

Answer 1

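Your code almost works; there is just a small indentation error. As written, the second for loop runs only once, after the first loop has finished, so it only sees the words and index of the last sentence. You need a nested for loop, because you want to update the index for every sentence in the dataset: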

def reverse_index(dataset):

    index_dictionary = {}

    for index in range(len(dataset)):
        phrase = dataset[index]
        words = phrase.lower()
        wordlist = words.split()

        for x in wordlist:
            if x in index_dictionary.keys():
                index_dictionary[x].append(index)
            else:
                index_dictionary[x] = [index]
    return (index_dictionary)
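
As a side note, the question imports defaultdict but never uses it. The following is only a sketch of how the same nested loop could look with defaultdict and enumerate; it is an alternative to the fix above, not something the course necessarily expects. Like the version above, it records an index once per occurrence, so a word repeated within one sentence would have that index listed more than once.

```python
from collections import defaultdict

def reverse_index(dataset):
    # Missing keys start out as an empty list, so no "if x in ..." check is needed.
    index_dictionary = defaultdict(list)

    # enumerate yields (position, sentence) pairs, replacing range(len(dataset)).
    for index, phrase in enumerate(dataset):
        # The inner loop is nested, so it runs once per sentence.
        for word in phrase.lower().split():
            index_dictionary[word].append(index)

    # Convert back to a plain dict so it prints like an ordinary dictionary.
    return dict(index_dictionary)

dataset = [
    "Python time",
    "It is that TIME",
    "python rules",
]

print(reverse_index(dataset))
```

With the dataset from the question, this should give the same dictionary as the expected output shown there.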


python

indexing

dictionary

dataset