Scrapy image pipeline object has no attribute 'spiderinfo' error
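I just ran into a problem with the Scrapy images pipeline. For background, I only wanted to download a few images to test the pipeline, but after writing the code I got an error when running the script.
Error message: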
2021-12-31 17:07:12 [scrapy.core.scraper] ERROR: Error processing {'image_urls': ['https://pic01.jituwang.com/190328/256612-1Z32Q1453861-lp.jpg']}
Traceback (most recent call last):
  File "e:\microsoft visual studio\shared\python37_64\lib\site-packages\twisted\internet\defer.py", line 859, in _runCallbacks
    current.result, *args, **kwargs
  File "e:\microsoft visual studio\shared\python37_64\lib\site-packages\scrapy\utils\defer.py", line 150, in f
    return deferred_from_coro(coro_f(*coro_args, **coro_kwargs))
  File "e:\microsoft visual studio\shared\python37_64\lib\site-packages\scrapy\pipelines\media.py", line 86, in process_item
    info = self.spiderinfo
AttributeError: 'ImgPileine' object has no attribute 'spiderinfo'
Command used to run the script:
scrapy crawl picspider
Scrapy version:
Scrapy 2.5.1
Here is the code:
# spider.py
import scrapy
from picscrapydemo.items import PicscrapydemoItem


class PicspiderSpider(scrapy.Spider):
    name = 'picspider'
    start_urls = ['http://www.jituwang.com/bizhi/qiche/']

    def parse(self, response):
        # Collect the src attribute of every image inside the "anPic" blocks.
        pics_src = response.xpath('//div[@class="anPic"]//img/@src').extract()
        for pic_src in pics_src:
            item = PicscrapydemoItem()
            item['image_urls'] = [pic_src]
            # print(pic_src)
            yield item
# pipelines.py
from itemadapter import ItemAdapter
from scrapy.pipelines.images import ImagesPipeline
import scrapy


class ImgPileine(ImagesPipeline):
    def open_spider(self, spider):
        print('start')

    def get_media_requests(self, item, info):
        # Schedule one download request per image URL on the item.
        for image_url in item['image_urls']:
            yield scrapy.Request(image_url)

    def file_path(self, request, response=None, info=None, *, item=None):
        # Use the last path segment of the URL as the file name.
        url = request.url
        file_name = url.split('/')[-1]
        return file_name

    def item_completed(self, results, item, info):
        return item
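The problem is in the open_spider function. The base class's open_spider is what sets self.spiderinfo, so overriding it without doing the same leaves the attribute unset by the time process_item reads it (the last frame of the traceback).
Replace it with: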
    def open_spider(self, spider):
        self.spiderinfo = self.SpiderInfo(spider)
        print('start')
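
To make the cause easier to see, here is a minimal sketch that reproduces the same failure without Scrapy. BasePipeline, BrokenPipeline, FixedPipeline and run are hypothetical stand-ins invented for illustration, not Scrapy's real classes; the only thing taken from the traceback is that process_item reads self.spiderinfo.

```python
class BasePipeline:
    """Hypothetical, simplified stand-in for the media pipeline base class."""

    def open_spider(self, spider):
        # The base class creates the per-spider state here.
        self.spiderinfo = {"spider": spider}

    def process_item(self, item):
        # Raises AttributeError if open_spider never set spiderinfo.
        info = self.spiderinfo
        return item, info


class BrokenPipeline(BasePipeline):
    def open_spider(self, spider):
        # Overrides the base method without setting spiderinfo.
        print("start")


class FixedPipeline(BasePipeline):
    def open_spider(self, spider):
        # Let the base class set up its state, then add custom behaviour.
        super().open_spider(spider)
        print("start")


def run(pipeline_cls):
    """Open a pipeline, process one item, and report what happened."""
    pipeline = pipeline_cls()
    pipeline.open_spider("picspider")
    try:
        pipeline.process_item({"image_urls": []})
        return "ok"
    except AttributeError as exc:
        return str(exc)


broken_result = run(BrokenPipeline)
fixed_result = run(FixedPipeline)
print(broken_result)
print(fixed_result)
```

In the real pipeline, calling super().open_spider(spider) at the top of the override should have the same effect as the explicit assignment above, assuming the base method in your Scrapy version does that setup; it also keeps working if the base class later adds more initialisation.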