Error accessing tree in semcor.tagged_sents()
I am using the SemCor corpus reader from NLTK.
import nltk
nltk.download('semcor')
from nltk.corpus import semcor
semcor.sents() iterates over all sentences, each represented as a list of tokens:
print(semcor.sents()[0])
>>> ['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', 'Friday', 'an', 'investigation', 'of', 'Atlanta', "'s", 'recent', 'primary', 'election', 'produced', '``', 'no', 'evidence', "''", 'that', 'any', 'irregularities', 'took', 'place', '.']
and semcor.tagged_sents() iterates over the same sentences with additional annotation, including WordNet lemma identifiers:
semcor.tagged_sents(tag="sem")[0]
>>> [['The'],
Tree(Lemma('group.n.01.group'), [Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])]),
Tree(Lemma('state.v.01.say'), ['said']),
Tree(Lemma('friday.n.01.Friday'), ['Friday']),
['an'],
Tree(Lemma('probe.n.01.investigation'), ['investigation']),
['of'],
Tree(Lemma('atlanta.n.01.Atlanta'), ['Atlanta']),
["'s"],
Tree(Lemma('late.s.03.recent'), ['recent']),
Tree(Lemma('primary.n.01.primary_election'), ['primary', 'election']),
Tree(Lemma('produce.v.04.produce'), ['produced']),
['``'],
['no'],
Tree(Lemma('evidence.n.01.evidence'), ['evidence']),
["''"],
['that'],
['any'],
Tree(Lemma('abnormality.n.04.irregularity'), ['irregularities']),
Tree(Lemma('happen.v.01.take_place'), ['took', 'place']),
['.']]
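My goal is to create a function that takes a sentence from SemCor as input and extracts a list containing, for each token of the sentence, either the corresponding WordNet Lemma (e.g. Lemma('friday.n.01.Friday')) or None.
Now I want to access the second element of the list above (Tree(Lemma('group.n.01.group'), [Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])])). However, when I run: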
semcor.tagged_sents(tag="sem")[0][1]
I get the following error:
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tree.py in _repr_png_(self)
805 env_vars=['PATH'],
--> 806 verbose=False,
807 )
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\internals.py in find_binary(name, path_to_bin, env_vars, searchpath, binary_names, url, verbose)
696 find_binary_iter(
--> 697 name, path_to_bin, env_vars, searchpath, binary_names, url, verbose
698 )
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\internals.py in find_binary_iter(name, path_to_bin, env_vars, searchpath, binary_names, url, verbose)
680 for file in find_file_iter(
--> 681 path_to_bin or name, env_vars, searchpath, binary_names, url, verbose
682 ):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\internals.py in find_file_iter(filename, env_vars, searchpath, file_names, url, verbose, finding_dir)
638 div = '=' * 75
--> 639 raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
640
LookupError:
===========================================================================
NLTK was unable to find the gs file!
Use software specific configuration paramaters or set the PATH environment variable.
===========================================================================
During handling of the above exception, another exception occurred:
LookupError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\formatters.py in __call__(self, obj)
343 method = get_real_method(obj, self.print_method)
344 if method is not None:
--> 345 return method()
346 return None
347 else:
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tree.py in _repr_png_(self)
817 "https://docs.brew.sh/Installation then `brew install ghostscript`")
818 print(pre_error_message, file=sys.stderr)
--> 819 raise LookupError
820
821 with open(out_path, 'rb') as sr:
LookupError:
Tree(Lemma('group.n.01.group'), [Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])])
However, the output is still:
Tree(Lemma('group.n.01.group'), [Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])])
What does this LookupError mean, and what should I do about it?
My goal is to create a function that takes as input a sentence from
SemCor and extracts a list which contains, for each token of the
sentence, either the corresponding WordNet Lemma (e.g.
Lemma('friday.n.01.Friday')) or None.
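import nltk
def lemma_list(sent):
    # Tagged chunks are Tree objects whose label is the Lemma; untagged chunks are plain lists.
    return [chunk.label() if isinstance(chunk, nltk.tree.Tree) else None for chunk in sent]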
Example:
lemma_list(semcor.tagged_sents(tag="sem")[0])
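# [None, Lemma('group.n.01.group'), Lemma('state.v.01.say'), Lemma('friday.n.01.Friday'), None, Lemma('probe.n.01.investigation'), None, Lemma('atlanta.n.01.Atlanta'), None, Lemma('late.s.03.recent'), Lemma('primary.n.01.primary_election'), Lemma('produce.v.04.produce'), None, None, Lemma('evidence.n.01.evidence'), None, None, None, Lemma('abnormality.n.04.irregularity'), Lemma('happen.v.01.take_place'), None]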
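Note that lemma_list returns one entry per annotated chunk, so a multi-word chunk such as ['took', 'place'] contributes a single entry. If one entry per token is what is wanted, a minimal sketch could repeat each label for every token in its chunk. The name lemma_per_token and the hand-built sample sentence below are illustrative assumptions; plain strings stand in for the real Lemma objects so that no corpus download is needed.

```python
from nltk.tree import Tree

def lemma_per_token(sent):
    """Return one label (or None) per token of a SemCor-style chunk list."""
    labels = []
    for chunk in sent:
        if isinstance(chunk, Tree):
            # Repeat the chunk's label once for every token it covers.
            labels.extend([chunk.label()] * len(chunk.leaves()))
        else:
            # Untagged chunks are plain lists of tokens.
            labels.extend([None] * len(chunk))
    return labels

# Hypothetical sample mimicking the structure of semcor.tagged_sents(tag="sem")[0];
# string labels replace the Lemma objects.
sample = [
    ["The"],
    Tree("group.n.01.group", [Tree("NE", ["Fulton", "County", "Grand", "Jury"])]),
    Tree("state.v.01.say", ["said"]),
    ["."],
]

labels = lemma_per_token(sample)
print(labels)
```

With the real corpus, the same function would be called as lemma_per_token(semcor.tagged_sents(tag="sem")[0]), and the result should then line up index by index with semcor.sents()[0].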
As for the error and what it means, see "NLTK was unable to find the gs file".
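Judging from the traceback, the error is raised while the notebook tries to render the Tree as an image (Tree._repr_png_), which looks for the Ghostscript binary gs. When that fails, the plain text representation is shown instead, which would explain why the output still appears. A minimal sketch of inspecting a tree as text, so that no image rendering is attempted, is given below. The hand-built node is an assumption standing in for semcor.tagged_sents(tag="sem")[0][1], with a string label in place of the Lemma object.

```python
from nltk.tree import Tree

# Stand-in for semcor.tagged_sents(tag="sem")[0][1]; no corpus download needed.
node = Tree("group.n.01.group", [Tree("NE", ["Fulton", "County", "Grand", "Jury"])])

text_form = str(node)    # bracketed text form; Ghostscript is not involved
print(text_form)
print(node.label())      # the label attached to the chunk
print(node.leaves())     # the tokens covered by the chunk
```

Calling print() on the tree, or assigning it to a variable instead of leaving it as the last expression of a cell, should avoid the rich display path altogether. Installing Ghostscript and putting it on PATH is only needed if the graphical rendering is actually wanted.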