Scrapy provides no output with XPath selector
This is the code I am trying to run in the scrapy shell to get the article headline from dailymail.co.uk.
$ scrapy shell "https://www.dailymail.co.uk/tvshowbiz/article-8257569/Shia-LaBeouf-revealed-heavily-tattoo-torso-goes-shirtless-run-hot-pink-shorts.html"
headline = response.xpath("//div[@id='js-article-text']/h2/text()").extract()
Set a user agent on your request and it should work:
scrapy shell -s USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:52.0) Gecko/20100101 Firefox/52.0" "https://www.dailymail.co.uk/tvshowbiz/article-8257569/Shia-LaBeouf-revealed-heavily-tattoo-torso-goes-shirtless-run-hot-pink-shorts.html"
response.xpath("//div[@id='js-article-text']/h2/text()").extract()
Output:
Shia LaBeouf reveals his heavily tattoo torso as he goes shirtless for a run in hot pink shorts