Scrapy doesn't write a file to my directory, so I can't see the output
Hey, I'm trying to crawl a website and scrape its titles, but it doesn't write the text file I need. It doesn't give me any errors, and I've tried everything, including creating the file by hand from cmd, etc.
Here is my code:
import scrapy

class My_spider(scrapy.Spider):
    name = "book_crawler"

    def start_request(self):
        url_list = []
        for i in range(2, 5):
            url_list.append("https://books.toscrape.com/catalogue/page-" + str(i) + ".html")
        urls = []
        urls.append("https://books.toscrape.com/")
        for i in range(0, len(url_list)):
            urls.append(url_list[i])
        for url in urls:
            yield scrapy.request(url=url, callback=self.parse)

    def parse(self, response):
        title_list = response.xpath("article[@class='product_pod']/h3/a/text()").extract()
        with open('book_titel.txt', 'a+') as f:
            for i in range(0, len(title_list)):
                f.write(str(i) + " : " + title_list[i] + "\n")
I found 3 errors in your code: use start_requests instead of start_request, use scrapy.Request instead of scrapy.request, and finally your XPath is wrong — "article[@class='product_pod']" is a relative path evaluated from the document root, so it matches nothing; it needs to select anywhere in the document. With the wrong method name, Scrapy never schedules any requests, which is why you get no errors and no file. Corrected code:
import scrapy

class My_spider(scrapy.Spider):
    name = "book_crawler"

    def start_requests(self):
        url_list = []
        for i in range(2, 5):
            url_list.append("https://books.toscrape.com/catalogue/page-" + str(i) + ".html")
        urls = []
        urls.append("https://books.toscrape.com/")
        for i in range(0, len(url_list)):
            urls.append(url_list[i])
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        title_list = response.css("ol.row").xpath('//h3/a/text()').extract()
        print(title_list)
        with open('book_titel.txt', 'a+') as f:
            for i in range(0, len(title_list)):
                f.write(str(i) + " : " + title_list[i] + "\n")