Scrapy always running same command from command prompt

I am trying to learn Scrapy on Bash on Ubuntu on Windows 10. I created one spider (yelprest) using the genspider command, and then a second one directly by creating the spider file (following the official tutorial at https://doc.scrapy.org/en/latest/intro/tutorial.html).

The first spider has not been tested yet, but I tried to complete the tutorial with the second spider, and when I try to run it, I get an error pointing to the first spider. Also, when I run any other scrapy command (like version), I get the same error. Here is the error:

(BashEnv) root > scrapy version
Traceback (most recent call last):
  File "/mnt/s/BashEnv/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 148, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 243, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 134, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/mnt/s/BashEnv/Scrapy/Scrapy/spiders/yelprest.py", line 14
    rules = (
    ^
IndentationError: unexpected indent
(BashEnv) root >

I do not understand why I get this same error for every command I run.

There is an error in your yelprest.py file (at or before line 14): it is not valid Python. Fix this error and everything will work. Make sure your file is indented consistently and that you do not mix spaces and tabs.
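As an illustration of what the interpreter is complaining about (the fragment below is made up, not the asker's actual file), a single extra leading space before `rules` is enough to make the whole module invalid Python:

```python
# Minimal reproduction of the error class from the traceback: one extra
# space before 'rules' invalidates the module, which in turn breaks every
# Scrapy command that imports the project's spiders.
source = (
    "class YelprestSpider:\n"
    "    name = 'yelprest'\n"
    "     rules = (\n"  # five spaces instead of four
    "    )\n"
)

try:
    compile(source, "yelprest.py", "exec")
except IndentationError as e:
    print("%s at line %d" % (e.msg, e.lineno))  # unexpected indent at line 3
```

Mixing tabs and spaces produces the same kind of failure, which is why sticking to 4-space indentation throughout the file is the safest fix.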

EDIT:

To confirm that the error is in this file, just delete it. If everything works without this file, the error must be there!
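An alternative to deleting the file is to let Python check it directly with the stdlib `py_compile` module. The sketch below uses a throwaway demo file with a deliberate stray space; pointing `py_compile.compile()` at the real path from the traceback would check it the same way:

```python
import os
import py_compile
import tempfile

# Stand-in for spiders/yelprest.py: a module whose first statement is
# indented by a stray space, which is invalid at module level. Replace
# 'path' with your real spider file to check it in place.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(" rules = ()\n")
    path = f.name

try:
    # doraise=True raises PyCompileError instead of printing a report
    py_compile.compile(path, doraise=True)
    print("compiles cleanly")
except py_compile.PyCompileError as e:
    print("problem found:", e.exc_type_name)
finally:
    os.remove(path)
```

This reports the offending file and line without touching your project, which is handy when you have many spiders and are not sure which one is broken.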


UPDATE:

Your question does not state it clearly, but judging by your comment, your question is "why does Scrapy load my spider code for every command?". The answer is: because that is how Scrapy works. Some commands can only be run inside a project, such as check or crawl. Some commands may be run anywhere, such as startproject. But inside a Scrapy project, any command will load all of your code. Scrapy is made that way.

For example, I have a project named crawler (I know, very descriptive!):

$ cd ~
$ scrapy version
Scrapy 1.4.0
$ cd crawler/
$ scrapy version
2017-10-31 14:47:42 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: crawler)
2017-10-31 14:47:42 [scrapy.utils.log] INFO: Overridden settings: {...}
Scrapy 1.4.0