Scrapy Amazon Pagination First Few Pages
I am currently handling pagination in my Scrapy-based Amazon scraper with the following code:
next_page = response.xpath('//li[@class="a-last"]/a/@href').get()
if next_page:
    next_page = 'https://www.amazon.com' + next_page
    yield scrapy.Request(url=next_page, callback=self.parse, headers=self.amazon_header, dont_filter=True)
Suppose I only want to get data from the first 3 pages. How would I do that?
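Go to your settings.py file and limit the number of crawled pages as follows:
CLOSESPIDER_PAGECOUNT = 3
Keep in mind that this setting counts every response the spider crawls, not only listing pages, so it matches "first 3 pages" only if the spider requests nothing but the listing pages.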
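If the limit should apply to the pagination itself rather than to the whole crawl, another option (not part of the answer above) is to carry a page counter alongside each request and stop following the "next" link once the limit is reached. The following is only a minimal sketch: the helper name next_page_url, the MAX_PAGES constant, and the page argument are assumptions made for illustration, and the spider usage is shown as comments because it depends on your own parse method.

```python
from typing import Optional
from urllib.parse import urljoin

MAX_PAGES = 3  # assumed limit: stop after the third listing page


def next_page_url(current_url: str, href: Optional[str], page: int,
                  max_pages: int = MAX_PAGES) -> Optional[str]:
    """Return the absolute URL of the next page, or None when crawling should stop.

    `page` is the 1-based number of the page that was just parsed.
    """
    if not href or page >= max_pages:
        return None
    # urljoin resolves a relative href such as '/s?k=x&page=2' against the current URL
    return urljoin(current_url, href)


# Hypothetical use inside the spider (the counter travels via cb_kwargs):
#
#     def parse(self, response, page=1):
#         ...  # extract items here
#         href = response.xpath('//li[@class="a-last"]/a/@href').get()
#         url = next_page_url(response.url, href, page)
#         if url:
#             yield scrapy.Request(url, callback=self.parse,
#                                  headers=self.amazon_header,
#                                  cb_kwargs={"page": page + 1})
```

Unlike CLOSESPIDER_PAGECOUNT, this counts only listing pages, so requests for product detail pages would not use up the limit.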
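Alternative:
Suppose the URL looks like this:
url = ['https://www.quote.toscrape/page=1 something']
Now build the page URLs directly in start_urls like this, and remove the next-page request from parse:
start_urls = ['https://www.quote.toscrape/page=' + str(x) + ' something' for x in range(1, 4)]
Note that the upper bound of range is exclusive, so range(1, 4) yields pages 1, 2 and 3.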
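The same idea can be written so that the number of pages is an explicit parameter, which avoids off-by-one mistakes with range. This is just a sketch: the function name first_page_urls and the URL template are hypothetical placeholders, not a real site layout.

```python
from typing import List


def first_page_urls(template: str, pages: int) -> List[str]:
    """Build the URLs for pages 1..pages (inclusive) from a template with a {page} slot."""
    # range's upper bound is exclusive, hence pages + 1
    return [template.format(page=n) for n in range(1, pages + 1)]


# Hypothetical template; substitute the real listing URL of the target site.
URL_TEMPLATE = "https://example.com/search?page={page}"

start_urls = first_page_urls(URL_TEMPLATE, 3)
```

With start_urls generated this way, the spider should not also follow the "next" link, otherwise pages beyond the third would still be requested.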