How to crawl a page in Scrapy where part of the data is fetched later through ajax?

Question

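I want to scrape page A, which contains a value that has to be POSTed to page B. The result from page B should then be appended to the data scraped from page A.

More specifically, the phone number is only displayed when the user clicks on the masked text.

In Scrapy, how can I launch another request from within the parse method and append the parsed data to the main data?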

Answer 1

A simple way is to use the Python requests library during parsing.

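import requests
import scrapy


class TestSpider(scrapy.Spider):
    # .... your spider code (name, start_urls, etc.)

    def parse(self, response):
        data = {}  # the fields scraped from page A go here
        url = response.xpath("selector of URL").get()
        value = response.xpath("selector of value").get()  # placeholder: the value page B expects
        # Blocking call: requests runs outside Scrapy's scheduler and middlewares.
        r = requests.post(url, data={"key": value})
        # This assumes the response is JSON; otherwise you have to parse the result yourself.
        result = r.json().get("new_key")
        data["new_key"] = result
        yield data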

But if the problem is more complex, you can use Scrapy Items (scrapy.Item). You can find an example here and read more about Items here.

python

web-crawler

scrapy