I need to perform a stemming operation in Python without NLTK, using pipeline methods

Question

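I have a list of words and a list of stem rules. I need to stem the words whose suffixes appear in the stem rules list. A friend gave me a hint that I can use pipeline methods.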

For example, if I have:

stem = ['less', 'ship', 'ing', 'les', 'ly', 'es', 's']
text = ['friends', 'friendly', 'keeping', 'friendship']

I should get: 'friend', 'friend', 'keep', 'friend'

Answer 1

You can use regular expressions (the re package) to find the suffix patterns and remove them:

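import re

text = ['friends', 'friendly', 'keeping', 'friendship']
stems = [
    # Strip one listed suffix. The trailing `$` anchors the match to the end
    # of the word, so an "s" or "es" in the middle of a word is left alone.
    re.sub(r'(?:less|ship|ing|les|ly|es|s)$', '', word)
    for word in text
]

print(stems)  # ['friend', 'friend', 'keep', 'friend']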
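The question does not say what "pipeline methods" means. Assuming it refers to chaining small processing steps so that each word passes through them in order, a minimal sketch without regular expressions could look like the following. The names `strip_suffix`, `pipeline`, and `STEM_RULES`, and the `strip`/`lower` cleanup stages, are illustrative assumptions and not part of the original question.

```python
from functools import reduce

# Suffix rules from the question. The first matching rule wins, so longer
# suffixes should come before shorter ones they contain ('es' before 's').
STEM_RULES = ['less', 'ship', 'ing', 'les', 'ly', 'es', 's']


def strip_suffix(word, rules=STEM_RULES):
    """Remove the first suffix in `rules` that the word ends with."""
    for suffix in rules:
        # Leave the word alone if the suffix is the entire word.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[:-len(suffix)]
    return word


def pipeline(*stages):
    """Compose the stages left to right into a single callable."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


# Hypothetical pipeline: trim whitespace, lowercase, then strip one suffix.
stem = pipeline(str.strip, str.lower, strip_suffix)

text = ['friends', 'friendly', 'keeping', 'friendship']
stems = [stem(word) for word in text]
print(stems)  # ['friend', 'friend', 'keep', 'friend']
```

Each stage is an ordinary function from string to string, so further steps (for example a second suffix pass) can be added by appending another argument to `pipeline(...)`.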

pipeline

stemming