Scrapy always running same command from command prompt
I am trying to learn Scrapy on Bash on Ubuntu on Windows 10. I created one spider (yelprest) with the genspider command, and then created a second spider by writing the spider file directly (following the official tutorial https://doc.scrapy.org/en/latest/intro/tutorial.html).
I have not tested the first spider yet, but when I tried to work through the tutorial with the second spider and run it, I got an error pointing at the first spider. Also, when I run any other scrapy command (like version), I get the same error. Here is the error:
(BashEnv) root > scrapy version
Traceback (most recent call last):
  File "/mnt/s/BashEnv/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 148, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 243, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 134, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/mnt/s/BashEnv/Scrapy/Scrapy/spiders/yelprest.py", line 14
    rules = (
    ^
IndentationError: unexpected indent
(BashEnv) root >
I don't understand why I get the same error for every command I run.
There is an error in your yelprest.py file (at or before line 14): it is not valid Python. Fix that error and everything will work. Make sure your file is indented consistently and does not mix spaces and tabs.
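For reference, here is a minimal sketch of a correctly indented spider file. It is illustrative only: the rules = ( in your traceback suggests the genspider crawl template, but the domain, URLs, and callback below are assumptions, not your actual code. Every class-level attribute lines up at exactly four spaces:

from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule

class YelprestSpider(CrawlSpider):
    name = 'yelprest'
    allowed_domains = ['yelp.com']          # assumed domain, for illustration
    start_urls = ['https://www.yelp.com/']  # assumed start URL

    # `rules` must line up with `name` above; pushing it one level deeper
    # is exactly what raises "IndentationError: unexpected indent".
    rules = (
        Rule(LinkExtractor(allow=r'/biz/'), callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        # Placeholder extraction; replace with your real parsing logic.
        yield {'url': response.url}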
Edit:
To make sure the error really is in this file, just remove it. If everything works without the file, the error must be in there!
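For example (the path is taken from your traceback; moving the file aside is safer than deleting it outright):

(BashEnv) root > mv /mnt/s/BashEnv/Scrapy/Scrapy/spiders/yelprest.py /tmp/
(BashEnv) root > scrapy version

If scrapy version now prints a version number instead of a traceback, the error is confined to yelprest.py; move the file back and fix its indentation.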
Update:
Your question does not make it clear, but based on your comments your real question is "why does Scrapy load my spider code for every command?". The answer is: because Scrapy is designed that way. Some commands can only be run inside a project, like check or crawl. Some commands can be run anywhere, like startproject. But inside a Scrapy project, any command will load all of your code. That is just how Scrapy is made.
For example, I have a project named crawler (very descriptive, I know!):
$ cd ~
$ scrapy version
Scrapy 1.4.0
$ cd crawler/
$ scrapy version
2017-10-31 14:47:42 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: crawler)
2017-10-31 14:47:42 [scrapy.utils.log] INFO: Overridden settings: {...}
Scrapy 1.4.0
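And to reproduce your failure mode: if I put a file with a syntax error into the spiders package of that project, every scrapy command fails while importing it (hypothetical session; the file name and its content are made up):

$ echo "    bad = 'indent'" > crawler/spiders/broken.py
$ scrapy version
...
IndentationError: unexpected indent

The project-level commands never even get to their own work: loading the spiders happens first, so one broken spider file breaks them all.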