Python: how can I print which sentence a word is in?

I'm writing an indexing program. I want it to tell me which sentence a word is in, so if I have:

"Hello world. My name is Nathan and I need help on Python. I am very confused and any help is appreciated."

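I want it to print which sentence each word comes from. I already have it counting the total number of times each word appears, and next to that count I need the numbers of the sentences the word appears in, so it displays as: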

a. word {word appearance count:sentence number}

with 'a.' being the list ordering (like a numbered list, but with letters). For the text above, an example would be

a. help {2:2,3}

This is the code I have so far:

word_counter = {}
sent_num = {}
linenum = 0
wordnum = 0
counter = 0

#not working
for word in f.lower().split('.'):
    if not word in sent_num:
        sent_num[word] = []
    sent_num[word].append(f.find(wordnum))


#working correctly
for word in f.lower().split():
    if not word in word_counter:
        word_counter[word] = []
        #if the word isn't listed yet, adds it
    word_counter[word].append(linenum)

for key in sorted(word_counter):
    counter += 1
    print (counter, key, len(word_counter[key]), len(sent_num[key]))

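In your code you only split on '.', so each item you loop over is a whole sentence, not a word. You need to split each sentence into words as well, and record the sentence number for each word: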

for number, sentence in enumerate(f.split('.'), start=1):
    for word in sentence.lower().split():
        if word not in sent_num:
            sent_num[word] = []
        sent_num[word].append(number)  # 1-based sentence number

Or something along those lines, depending on what you want to look at and count.
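To make that concrete, here is a minimal runnable sketch using the sample text from the question. It assumes `f` holds the whole text as one string and that sentences should be numbered from 1; neither is stated outright in the question, so adjust as needed.

```python
# Sample text from the question; `f` is assumed to be the full text as a string.
f = ("Hello world. My name is Nathan and I need help on Python. "
     "I am very confused and any help is appreciated.")

sent_num = {}
# enumerate(..., start=1) numbers the sentences 1, 2, 3, ...
for number, sentence in enumerate(f.split('.'), start=1):
    # The trailing empty piece after the last '.' has no words, so it adds nothing.
    for word in sentence.lower().split():
        if word not in sent_num:
            sent_num[word] = []
        sent_num[word].append(number)

print(sent_num['help'])       # [2, 3]
print(len(sent_num['help']))  # 2
```

The length of each list doubles as the appearance count, so a separate counter dictionary is probably not needed.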

It's pretty simple to loop over each sentence, then over each word in that sentence, and build a dict mapping {word: [sentence, ...]}:
In [1]:
d = {}
for i, sent in enumerate(f.lower().split('. ')):
    for w in sent.strip().split():
        d.setdefault(w, []).append(i)
d

Out[1]:
{'am': [2],
 'and': [1, 2],
 'any': [2],
 'appreciated.': [2],
 'confused': [2],
 'hello': [0],
 'help': [1, 2],
 ...}
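Note that the key 'appreciated.' keeps its period, because splitting on '. ' does not remove the final one. One possible way around that, as a sketch only, is to strip punctuation from each token with `str.strip(string.punctuation)`. This assumes you want punctuation removed only from the ends of words, so something like "don't" stays intact.

```python
import string

# Sample text from the question; `f` is assumed to be the full text as a string.
f = ("Hello world. My name is Nathan and I need help on Python. "
     "I am very confused and any help is appreciated.")

d = {}
for i, sent in enumerate(f.lower().split('. ')):
    for w in sent.split():
        w = w.strip(string.punctuation)  # drop leading/trailing punctuation such as '.'
        if w:                            # skip tokens that were only punctuation
            d.setdefault(w, []).append(i)

print(d['appreciated'])  # [2]
```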

Since each list holds every occurrence of the word, you can get the count by calling len() on it, for example:

In [2]:
len(d['help'])

Out[2]:
2
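If you want the output in the lettered format from the question (a. word {count:sentences}), something like the following could work. It is only a sketch: `letter_labels` is a hypothetical helper I made up to continue past 'z' with 'aa', 'ab', and so on, and the small dict stands in for the mapping built above, using 1-based sentence numbers as in the question's example.

```python
from itertools import count, product
import string

def letter_labels():
    """Yield 'a', 'b', ..., 'z', 'aa', 'ab', ... (hypothetical helper)."""
    for size in count(1):
        for letters in product(string.ascii_lowercase, repeat=size):
            yield ''.join(letters)

# Stand-in for the {word: [sentence numbers]} mapping, numbered from 1.
d = {'help': [2, 3], 'hello': [1], 'and': [2, 3]}

lines = []
for label, word in zip(letter_labels(), sorted(d)):
    sentences = ','.join(str(n) for n in d[word])
    # Doubled braces in the format string produce literal '{' and '}'.
    lines.append('{}. {} {{{}:{}}}'.format(label, word, len(d[word]), sentences))

print('\n'.join(lines))
# a. and {2:2,3}
# b. hello {1:1}
# c. help {2:2,3}
```

A word that occurs twice in the same sentence will list that sentence number twice; if you would rather see it once, you could format `sorted(set(d[word]))` instead while still taking the count from the full list.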