Find the right CSS selector to scrape a webpage with Scrapy
I am trying to scrape the page "https://www.woolworths.com.au/shop/browse/drinks/cordials-juices-iced-teas/iced-teas" to extract the product names, but I can't find a selector that works, not even for the price, the h1, or the title. I tried:
response.css(".shelfProductTile-descriptionLink")  # product name
response.css(".price-cents")  # price
response.css(".tileList-title")  # title
How should I proceed?
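The content is loaded dynamically by a POST XHR request that returns JSON, so it is not in the HTML that Scrapy downloads and the CSS selectors match nothing. You can find the request in the Network tab of your browser's developer tools.
The request goes to:
https://www.woolworths.com.au/apis/ui/browse/category
With the payload: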
{"categoryId":"1_9573995","pageNumber":1,"pageSize":24,"sortType":"TraderRelevance","url":"/shop/browse/drinks/cordials-juices-iced-teas/iced-teas","location":"/shop/browse/drinks/cordials-juices-iced-teas/iced-teas","formatObject":"{\"name\":\"Iced Teas\"}","isSpecial":false,"isBundle":false,"isMobile":false,"filters":"null"}
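If you build this payload in Python rather than pasting it as a string, the booleans are written as Python's False and json.dumps serializes them to JSON's false. Here is a minimal sketch of that; build_payload and CATEGORY_PATH are hypothetical names, and the idea that changing pageNumber returns further pages is an assumption the source does not confirm.

```python
import json

# Path copied from the payload above; the constant name is hypothetical.
CATEGORY_PATH = "/shop/browse/drinks/cordials-juices-iced-teas/iced-teas"


def build_payload(page_number=1, page_size=24):
    """Return the JSON request body for one page of the category listing."""
    payload = {
        "categoryId": "1_9573995",
        "pageNumber": page_number,  # assumption: incrementing this pages through results
        "pageSize": page_size,
        "sortType": "TraderRelevance",
        "url": CATEGORY_PATH,
        "location": CATEGORY_PATH,
        # formatObject is itself a JSON document stored as a string.
        "formatObject": '{"name":"Iced Teas"}',
        "isSpecial": False,  # Python False becomes JSON false
        "isBundle": False,
        "isMobile": False,
        "filters": "null",
    }
    return json.dumps(payload)
```

Calling build_payload(2) would give the body for what is presumably the second page, ready to send as the POST body.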
In Scrapy, parse the response body as JSON:
json.loads(response.text)  # older Scrapy versions: response.body_as_unicode(), since deprecated