Match the last noun before a particular word
I am using Python and want to match the last noun before "needing".
text = "Charles and Kim are needing a good hot dog"
I have to use re.findall and nltk.
I tried the following, but it returns everything before the word, and I only need the last noun:
post = re.findall(r'.*needing', text)[0]
Expected result:
Kim
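Just use POS tagging from nltk.
You will need to download a couple of nltk resources; after that, tag the tokens and pick out the one you want. This code does it: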
import nltk
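# The first time you run this, download the tokenizer and tagger data
# by uncommenting the following two lines.
# nltk.download('punkt')
# nltk.download('averaged_perceptron_tagger')
# (Recent NLTK releases may ask for 'punkt_tab' and 'averaged_perceptron_tagger_eng'
# instead; the LookupError message names the missing resource.)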
text = "Charles and Kim are needing a good hot dog"
tokens = nltk.word_tokenize(text)
tags = nltk.pos_tag(tokens)
# Position of the target word; everything before this index is a candidate
sentence_split = tokens.index("needing")
# Find the words where tag meets your criteria (must be a noun / proper noun)
nouns_before_split = [word for (word, tag) in tags[:sentence_split] if tag.startswith('NN')]
# Show the result: the last noun before the target word
print(nouns_before_split[-1])
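Since the question asks for re.findall as well, here is a rough sketch of one way to combine the two. It is not the only way to do it. The helper names text_before, last_noun and last_noun_before are made up for this example. The regex grabs the text before the first whole-word occurrence of the target, and the tagger then runs on that prefix only. One caveat: tagging just the prefix gives the tagger less context than tagging the whole sentence, so tags could differ in some cases. Both helpers return None instead of raising when the word is missing or no noun is found.

```python
import re


def text_before(text, word):
    """Return everything before the first whole-word occurrence of `word`, or None."""
    # Non-greedy (.*?) stops at the first occurrence; \b avoids matching inside longer words.
    pattern = r'^(.*?)\b' + re.escape(word) + r'\b'
    matches = re.findall(pattern, text, flags=re.DOTALL)
    return matches[0] if matches else None


def last_noun(tagged):
    """Return the last word whose tag starts with 'NN' in a list of (word, tag) pairs, or None."""
    nouns = [w for (w, tag) in tagged if tag.startswith('NN')]
    return nouns[-1] if nouns else None


def last_noun_before(text, word):
    """Combine the two helpers; requires nltk and its tokenizer/tagger data."""
    import nltk  # imported here so the helpers above can be used without nltk installed
    prefix = text_before(text, word)
    if prefix is None:
        return None
    return last_noun(nltk.pos_tag(nltk.word_tokenize(prefix)))
```

With the nltk data installed, calling last_noun_before(text, "needing") on the sentence from the question should tag only "Charles and Kim are " and return the last NN* word from it.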