Extracting images with Scrapy
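This is my first post on this forum, so apologies if I have left anything out!
I have had some success with my script (shown below), but it only returns one image: the last one on the page.
Any help is appreciated!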
import scrapy


class KwikEKartSpider(scrapy.Spider):
    name = "kek"
    allowed_domains = ["https://www.tesco.com/groceries/en-GB/shop/fresh-food/all"]
    start_urls = (
        'https://www.tesco.com/groceries/en-GB/shop/fresh-food/all',
    )

    def parse(self, response):
        links = response.xpath("//img/@src")
        html = ""
        for link in links:
            url = link.get()
        if any(extension in url for extension in
               [".jpg", ".png"]):
            html += """<a href="{url}"
            target="_blank">
            <img src="{url}" height="33%"
            width="33%"/>
            <a/>""".format(url=url)
        with open("frontpage.html", "a") as page:
            page.write(html)
            page.close()
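This is an indentation problem.
The only line inside the for loop is:
    url = link.get()
so by the time the `if` check runs, `url` holds only the last image on the page. You probably want to indent the rest of the code so that it runs inside the loop as well.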
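To illustrate the fix, here is a minimal sketch of the loop with the `if` block indented under the `for`. The helper name `build_gallery_html` and the constant `IMAGE_EXTENSIONS` are hypothetical, introduced only for this example; they are not part of the original spider. The sketch also assumes the closing tag was meant to be `</a>` rather than `<a/>`.

```python
IMAGE_EXTENSIONS = (".jpg", ".png")  # hypothetical constant, same extensions as the question


def build_gallery_html(urls):
    """Return one linked <img> snippet for each URL that looks like a JPG or PNG."""
    html = ""
    for url in urls:
        # The check and the append are indented under the for loop,
        # so they run once per URL instead of once after the loop ends.
        if any(extension in url for extension in IMAGE_EXTENSIONS):
            html += (
                '<a href="{url}" target="_blank">'
                '<img src="{url}" height="33%" width="33%"/>'
                "</a>\n"
            ).format(url=url)
    return html


# Inside the spider, parse() could then look something like this
# (assuming a Scrapy version where SelectorList.getall() is available):
#
#     def parse(self, response):
#         html = build_gallery_html(response.xpath("//img/@src").getall())
#         with open("frontpage.html", "a") as page:
#             page.write(html)

if __name__ == "__main__":
    sample = [
        "https://example.com/a.jpg",
        "https://example.com/b.gif",
        "https://example.com/c.png",
    ]
    print(build_gallery_html(sample))
```

Writing the file once, after the loop, keeps the `with` block outside the loop on purpose; the explicit `page.close()` is not needed because the `with` statement already closes the file.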