AttributeError: can't set attribute from nltk.book import *
After installing nltk I imported it and ran nltk.download(), but when I try "from nltk.book import *" it raises an AttributeError.
from nltk.corpus import * and from nltk import * work fine.
I'm new to natural language processing, so I don't know much about this. Please help.
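In short, the sequence was roughly this (only the last import fails):

    import nltk
    nltk.download()             # used the downloader to fetch the data
    from nltk.corpus import *   # works
    from nltk import *          # works
    from nltk.book import *     # raises the AttributeError shown below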
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    from nltk.book import *
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\book.py", line 19, in <module>
    text1 = Text(gutenberg.words('melville-moby_dick.txt'))
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\text.py", line 295, in __init__
    tokens = list(tokens)
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\corpus\reader\util.py", line 233, in __len__
    for tok in self.iterate_from(self._toknum[-1]): pass
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\corpus\reader\util.py", line 291, in iterate_from
    tokens = self.read_block(self._stream)
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\corpus\reader\plaintext.py", line 117, in _read_word_block
    words.extend(self._word_tokenizer.tokenize(stream.readline()))
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\tokenize\regexp.py", line 126, in tokenize
    self._check_regexp()
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\tokenize\regexp.py", line 121, in _check_regexp
    self._regexp = compile_regexp_to_noncapturing(self._pattern, self._flags)
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\internals.py", line 56, in compile_regexp_to_noncapturing
    return sre_compile.compile(convert_regexp_to_noncapturing_parsed(sre_parse.parse(pattern)), flags=flags)
  File "C:\Program Files (x86)\Python 3.5\lib\site-packages\nltk\internals.py", line 52, in convert_regexp_to_noncapturing_parsed
    parsed_pattern.pattern.groups = 1
AttributeError: can't set attribute
I'm not sure whether you've solved your problem yet.
Just in case, the same issue has also been reported here:
https://github.com/nltk/nltk/issues/1135
Workaround:
https://github.com/nltk/nltk/issues/1106
"I was able to fix this problem by going into the internals.py
file in the nltk
directory and removing the line parsed_pattern.pattern.groups = 1
. My rationale behind this was, after doing a bit of code reading, the original version of sre_parse.py
that NLTK was designed to work stored groups
as an attribute of an instance of the sre_parse.Pattern
class. The version that comes with Python 3.5 stores groups
as a property which returns (I'm not too familiar with properties, but this is what I presume it does) the length of a subpattern
list. The code I'm talking about is here at about line 75. What I don't know is what the long term effects of doing this will be, I came up with this solution just by tracing through the code, I haven't looked at what bugs this may cause in the long run. Someone please tell me if this would cause problems and if there's a better solution."
The above has caused no problems for me so far.
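To make the mechanism in the quote concrete: assigning to a property that has no setter raises exactly this error. The snippet below is a standalone sketch, not NLTK code; the toy Pattern class just mimics how, per the quote, Python 3.5's sre_parse computes groups from a subpattern list instead of storing it as a plain attribute.

    class Pattern:
        """Toy stand-in for sre_parse.Pattern as described in the quote."""
        def __init__(self):
            self.subpatterns = [None]        # placeholder for group 0

        @property
        def groups(self):
            # computed on the fly from the subpattern list; no setter is defined
            return len(self.subpatterns)

    p = Pattern()
    print(p.groups)   # 1
    p.groups = 1      # AttributeError: can't set attribute (message on Python 3.5)

So removing (or commenting out) the parsed_pattern.pattern.groups = 1 assignment in nltk/internals.py simply avoids writing to that read-only property. For what it's worth, this incompatibility was also addressed upstream, so upgrading to a newer nltk release (pip install --upgrade nltk) should make the manual edit unnecessary; the linked issues track that fix.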