Scrapy: ImportError: No module named project_name.settings
I'm trying to write a script that runs many spiders, but I get ImportError: No module named project_name.settings
My script looks like this:
import os
os.system("scrapy crawl spider1")
os.system("scrapy crawl spider2")
....
os.system("scrapy crawl spiderN")
My settings.py:
# -*- coding: utf-8 -*-
# Scrapy settings for project_name
#
# For simplicity, this file contains only the most important settings by
# default. All the other settings are documented here:
#
# http://doc.scrapy.org/en/latest/topics/settings.html
#
BOT_NAME = 'project_name'
ITEM_PIPELINES = {
    'project_name.pipelines.project_namePipelineToJSON': 300,
    'project_name.pipelines.project_namePipelineToDB': 800
}
SPIDER_MODULES = ['project_name.spiders']
NEWSPIDER_MODULE = 'project_name.spiders'
# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'project_name (+http://www.yourdomain.com)'
My spiders look like any normal spider; they are actually quite simple...
import scrapy
from scrapy.crawler import CrawlerProcess
from Projectname.items import ProjectnameItem

class ProjectnameSpiderClass(scrapy.Spider):
    name = "Projectname"
    allowed_domains = ["Projectname.com"]
    start_urls = ["...urls..."]

    def parse(self, response):
        item = ProjectnameItem()
I've used generic names here, but you get the idea. Is there a way to fix this error?
2018 edit:
You need to run the crawlers from inside the project folder, which means that os.system("scrapy crawl spider1") must be run from the folder that contains spider1.
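For example, a minimal sketch of one way to do that (assuming Python 3.5+ for subprocess.run; "/path/to/project_name" and the spider names are placeholders):
# Run each crawl with the Scrapy project root (the folder that
# contains scrapy.cfg) as the working directory.
import subprocess

PROJECT_DIR = "/path/to/project_name"  # placeholder: your project root

for spider in ["spider1", "spider2", "spiderN"]:
    subprocess.run(["scrapy", "crawl", spider], cwd=PROJECT_DIR)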
Alternatively, you can do what I used to do and put all the code in a single file (old answer; I no longer recommend it, but it is still a useful and decent solution):
Well, in case someone comes across this question: I finally used a heavily modified version of this https://gist.github.com/alecxe/fc1527d6d9492b59c610 provided by alecxe in another question. Hope this helps.
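To give a rough idea of that single-file approach, here is a minimal sketch (the spider name and URL are placeholders; CrawlerProcess is part of Scrapy's public API):
# Define the spiders and run them all from one script with
# CrawlerProcess, so no project layout or settings module is needed.
import scrapy
from scrapy.crawler import CrawlerProcess

class Spider1(scrapy.Spider):
    name = "spider1"
    start_urls = ["https://example.com"]  # placeholder URL

    def parse(self, response):
        # Extract something trivial just to show the shape of a spider
        yield {"title": response.css("title::text").extract_first()}

process = CrawlerProcess(settings={"USER_AGENT": "my-bot"})
process.crawl(Spider1)
# process.crawl(Spider2)  # register additional spiders the same way
process.start()  # blocks until every registered crawl has finished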