Failed to scrape the link to the next page using xpath in puppeteer

I am trying to scrape the link to the next page from a webpage. I know how to do that using a CSS selector. However, things go wrong when I attempt to get the same link using XPath: what I get back is not the next page link.

const puppeteer = require("puppeteer");
let url = "https://whosebug.com/questions/tagged/web-scraping";
 
(async () => {
    const browser = await puppeteer.launch({headless:false});
    const [page] = await browser.pages();
    
    await page.goto(url,{waitUntil: 'networkidle2'});
    let nextPageLink = await page.$x("//a[@rel='next']", item => item.getAttribute("href"));
    // let nextPageLink = await page.$eval("a[rel='next']", elm => elm.href);
    console.log("next page:",nextPageLink);
    await browser.close();
})();

How can I scrape the link to the next page using xpath?

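  1. page.$x(expression) returns an array of element handles. You need destructuring or index access to get the first element from the array.
  2. To get a DOM element property from this element handle, you need to evaluate a function that receives the element handle as its argument, or use the element handle API.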
const [nextPageLink] = await page.$x("//a[@rel='next']");
const nextPageURL = await nextPageLink.evaluate(link => link.href);

Or:

const [nextPageLink] = await page.$x("//a[@rel='next']");
const nextPageURL = await (await nextPageLink.getProperty('href')).jsonValue();
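
Both variants throw if the XPath matches nothing, because destructuring an empty array leaves nextPageLink undefined. A minimal sketch of a guarded version follows. It assumes a Puppeteer version that still provides page.$x, and getNextPageUrl is a hypothetical helper name, not part of the Puppeteer API:

```javascript
// Hypothetical helper (not part of Puppeteer): returns the href of the first
// <a rel="next"> element, or null when the XPath matches nothing.
async function getNextPageUrl(page) {
  // page.$x resolves to an array of element handles; it is empty on no match.
  const [nextPageLink] = await page.$x("//a[@rel='next']");
  if (!nextPageLink) {
    return null;
  }
  // The callback runs in the page context, where `link` is the DOM element.
  return nextPageLink.evaluate(link => link.href);
}
```

In the script from the question, this could be called after page.goto(...) as `const nextPageURL = await getNextPageUrl(page);`, with a null result meaning there is no next page.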