Python: how can I print which sentence a word is in?
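I'm making an index program. I want it to tell me which sentence a word is in, so if I have:
"Hello world. My name is Nathan and I need help on Python. I am very confused and any help is appreciated."
I want it to print which sentence each word comes from. I already have it counting the total number of times each word appears; next to that count I need the numbers of the sentences the word comes from, so it displays as:
a. word {word appearance count:sentence number}
with 'a.' being the list order (like a numbered list, but with letters). An example from the text above would be:
a. help {2:2,3}
Here is the code I have so far: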
word_counter = {}
sent_num = {}
linenum = 0
wordnum = 0
counter = 0

#not working
for word in f.lower().split('.'):
    if not word in sent_num:
        sent_num[word] = []
    sent_num[word].append(f.find(wordnum))

#working correctly
for word in f.lower().split():
    if not word in word_counter:
        word_counter[word] = []
        #if the word isn't listed yet, adds it
    word_counter[word].append(linenum)

for key in sorted(word_counter):
    counter += 1
    print (counter, key, len(word_counter[key]), len(sent_num[key]))
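In your code, the loop that looks at whole sentences only splits on '.', so each "word" is actually an entire sentence. You need to split each sentence into words as well, then: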
for sent_idx, sentence in enumerate(f.split('.'), start=1):
    for word in sentence.lower().split():
        if word not in sent_num:
            sent_num[word] = []
        # record the 1-based number of the sentence containing this word
        sent_num[word].append(sent_idx)
Or something along those lines, depending on what you want to look at and count.
It's pretty simple to iterate over each sentence, then over each word in that sentence, and build a dict that maps {word: [sentence, ...]}:
In [1]:
d = {}
for i, sent in enumerate(f.lower().split('. ')):
    for w in sent.strip().split():
        d.setdefault(w, []).append(i)
d
Out[1]:
{'am': [2],
'and': [1, 2],
'any': [2],
'appreciated.': [2],
'confused': [2],
'hello': [0],
'help': [1, 2],
...}
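Note that 'appreciated.' in the output above keeps its trailing period, because the last sentence is not followed by '. '. One possible way around that, as a rough sketch: split on sentence-ending punctuation with re.split and strip punctuation from each word. The text f is the sample from the question; treating '.', '!' and '?' as sentence terminators is an assumption, and it will mis-split abbreviations such as "e.g.".

```python
import re
import string

f = ("Hello world. My name is Nathan and I need help on Python. "
     "I am very confused and any help is appreciated.")

d = {}
# Split on runs of '.', '!' or '?' (assumed terminators) and drop empty pieces.
sentences = [s for s in re.split(r'[.!?]+', f.lower()) if s.strip()]
for i, sent in enumerate(sentences):
    for w in sent.split():
        # Remove leading/trailing punctuation such as commas or quotes.
        w = w.strip(string.punctuation)
        if w:
            d.setdefault(w, []).append(i)

print(d['appreciated'])
print(d['help'])
```

The sentence indices are still 0-based here, the same as in the session above.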
Given that each list holds every occurrence of the word, you can get the count by calling len(), for example:
In [2]:
len(d['help'])
Out[2]:
2
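To get from a dict like this to the "a. help {2:2,3}" layout asked for in the question, something along these lines might work. It is only a sketch: letter_labels and format_index are hypothetical helper names, the small d is a hand-written subset of the mapping above, and listing each sentence number once even when a word repeats within a sentence is an assumption about the desired output.

```python
import itertools
import string

def letter_labels():
    """Yield 'a', 'b', ..., 'z', 'aa', 'ab', ... (hypothetical helper)."""
    for size in itertools.count(1):
        for combo in itertools.product(string.ascii_lowercase, repeat=size):
            yield "".join(combo)

def format_index(d):
    """Build lines like 'a. help {2:2,3}' from {word: [0-based sentence index, ...]}."""
    lines = []
    for label, word in zip(letter_labels(), sorted(d)):
        # Convert to 1-based sentence numbers and list each sentence once.
        sentence_numbers = sorted(set(i + 1 for i in d[word]))
        numbers = ",".join(str(n) for n in sentence_numbers)
        # len(d[word]) is the total number of appearances of the word.
        lines.append(f"{label}. {word} {{{len(d[word])}:{numbers}}}")
    return lines

# Hand-written subset of the mapping, for illustration only.
d = {'hello': [0], 'help': [1, 2], 'and': [1, 2]}
for line in format_index(d):
    print(line)
```

Words are labeled in alphabetical order, so 'help' gets whichever letter matches its position in the sorted list rather than always 'a'.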