How to split up a column name and find dictionary meanings with WordNet?
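I have the following data and am trying to get dictionary definitions, but it only works when the column name is a single word. How can I make it work for names with multiple words?
Code: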
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet
import pandas as pd  # used for the pd.DataFrame(tmp).T display

columns = ['Sector',
           'Community Name',
           'Date',
           'Community Center Point']
tmp = []
for x in columns:
    # The whole column name is passed as one query, spaces included
    syns = wordnet.synsets(x)
    tmp.append(syns[0].definition() if len(syns) > 0 else '')
Output:
pd.DataFrame(tmp).T
0                          1  2                               3
a plane figure bounded by     the specified day of the month
Columns 1 and 3 are empty because 'Community Name' and 'Community Center Point' contain multiple words.
Desired output:
0                            1                                            2                                       3
Sector: [a plane figure...]  Community: [definition], Name: [definition]  Date: [the specified day of the month]  Community: [definition], Center: [definition], Point: [definition]
from nltk.corpus import wordnet

columns = ['Sector', 'Community Name', 'Date', 'Community Center Point']
col_defs = []
for item in columns:
    tmp = []
    # Look up each word of the column name separately
    for word in item.split():
        syns = wordnet.synsets(word)
        # Fall back to an empty definition so join() never receives None
        tmp.append(word + ': ' + (syns[0].definition() if syns else ''))
    col_defs.append(', '.join(tmp))
for x in col_defs:
    print(x)
Output:
Sector: a plane figure bounded by two radii and the included arc of a circle
Community: a group of people living in a particular local area, Name: a language unit by which a person or thing is known
Date: the specified day of the month
Community: a group of people living in a particular local area, Center: an area that is approximately central within some larger region, Point: a geometric element that has position but no extension
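If the same lookup is needed for several tables, the loop above could be wrapped in a small helper. The following is only a sketch: the names `wordnet_lookup`, `describe_columns`, and the `lookup` and `missing` parameters are made up for illustration and are not part of NLTK. It assumes the first synset is the meaning you want, which will not always hold for ambiguous words.

```python
from typing import Callable, Iterable, List, Optional


def wordnet_lookup(word: str) -> Optional[str]:
    """Return the first WordNet definition for `word`, or None if there is none."""
    # Imported lazily so the helper below can also be used with another lookup.
    from nltk.corpus import wordnet
    syns = wordnet.synsets(word)
    return syns[0].definition() if syns else None


def describe_columns(columns: Iterable[str],
                     lookup: Callable[[str], Optional[str]] = wordnet_lookup,
                     missing: str = '') -> List[str]:
    """Build one 'Word: definition, Word: definition' string per column name."""
    described = []
    for name in columns:
        parts = []
        # Split on whitespace so every word is looked up on its own
        for word in name.split():
            definition = lookup(word)
            if definition is None:
                definition = missing  # placeholder for words with no entry
            parts.append(f"{word}: {definition}")
        described.append(', '.join(parts))
    return described
```

Calling `describe_columns(columns)` returns a list with one string per column name, and `pd.DataFrame([describe_columns(columns)])` should lay those strings out in a single row under columns 0 to 3, close to the desired output in the question.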