How to wait for all images to load after page.evaluate in Puppeteer when the DOM is populated by a client-side function

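I am trying to make code execution wait until all images have loaded before Puppeteer takes a screenshot. My DOM is populated when the initData() function is called; that function is defined in a client-side JS file. A delay or timeout is an option, but I am sure there must be a more efficient way.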

    const puppeteer = require('puppeteer');

    (async (dataObj) => {
      const url = dataObj.url;
      const payload = dataObj.payload;
      const browser = await puppeteer.launch({ headless: false, devtools: false });
      const page = await browser.newPage();
      await page.goto(url, { waitUntil: 'networkidle0' });

      await page.evaluate((payload) => {
        // initData is a client-side function that populates the DOM.
        initData(payload);
        // Need to wait here until the images are loaded.
      }, payload);

      await page.setViewport({ width: 1280, height: 720 });
      await page.screenshot({ path: 'test.png' });
      await browser.close();
    })(dataObj);

Thanks in advance.

You can do this with a promise: collect all <img> tags in the document and poll them until the browser has fetched every one (that is, until img.complete === true for all of them), then resolve the promise.

HTMLImageElement.complete Read only

Returns a Boolean that is true if the browser has finished fetching the image, whether successful or not. It also shows true, if the image has no src value.

Ref.: MDN HTMLImageElement
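Because complete is also true for images that failed to load or have no src, it can help to tell the cases apart before taking the screenshot. Below is a minimal sketch, assuming a hypothetical helper name classifyImages and relying on naturalWidth being 0 for images that did not produce a usable bitmap (some images without intrinsic dimensions may also report 0, so treat the "failed" bucket as a hint rather than a verdict):

```javascript
// Hypothetical helper (not part of the answer's code): splits image-like
// objects into loaded, failed and pending groups.
function classifyImages(images) {
  const result = { loaded: [], failed: [], pending: [] };
  for (const img of Array.from(images)) {
    if (!img.complete) {
      result.pending.push(img); // the browser is still fetching it
    } else if (img.naturalWidth > 0) {
      result.loaded.push(img); // fetched, with a non-zero intrinsic width
    } else {
      result.failed.push(img); // complete, but broken or without a src
    }
  }
  return result;
}
```

In the page it would be called as classifyImages(document.images).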

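I have implemented a function for this. It returns a promise that resolves when all imgs have been fetched and rejects on timeout (30 seconds by default, but this can be changed).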

Usage:

// consuming the promise
imgReady().then(
  (imgs) => {
    // do stuff here
    console.log('imgs ready');
  },
  (err) => {
    console.log('imgs taking too long to load');
  }
);

// inside async functions
const imgs = await imgReady();

Note on window.onload: You can use window.onload too; however, window.onload waits for everything to load instead of only images.

/**
 * @param timeout  how long to wait (ms) before rejecting and stopping the checks.
 * @param tickrate how long to wait (ms) between checks of all imgs.
 *
 * @returns
 *   A promise that resolves once every img in the document has been fetched.
 *   The promise is rejected if @timeout elapses first.
 */
function imgReady(timeout = 30 * 1000, tickrate = 10) {
  const imgs = Array.from(document.getElementsByTagName('img'));
  const t0 = Date.now();

  return new Promise((resolve, reject) => {
    const checkImg = () => {
      if (Date.now() - t0 > timeout) {
        reject(new Error('CheckImgReadyTimeoutException: imgs taking too long to load.'));
        return; // stop polling once the promise has been rejected
      }

      if (imgs.every((x) => x.complete)) {
        resolve(imgs);
      } else {
        setTimeout(checkImg, tickrate);
      }
    };

    checkImg();
  });
}

imgReady().then(console.log,console.error);
img{max-width: 100px;}
<img src="https://upload.wikimedia.org/wikipedia/commons/c/cc/ESC_large_ISS022_ISS022-E-11387-edit_01.JPG">
<br>
<img src="https://www.publicdomainpictures.net/pictures/90000/velka/planet-earth-1401465698wt7.jpg">
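As an alternative to polling, the same idea can be expressed with the load and error events of each image. This is a different technique from the timer loop above, shown only as a sketch: the name imagesSettled is hypothetical, and the function assumes it receives image-like objects that expose complete and addEventListener.

```javascript
// Hypothetical helper: resolves with the number of images it had to wait
// for once each of them has fired `load` or `error`; rejects after `timeout` ms.
function imagesSettled(images, timeout = 30 * 1000) {
  // Images that are already complete will not fire another event.
  const pending = Array.from(images).filter((img) => !img.complete);

  const settled = Promise.all(
    pending.map(
      (img) =>
        new Promise((resolve) => {
          img.addEventListener('load', resolve, { once: true });
          img.addEventListener('error', resolve, { once: true });
        })
    )
  ).then(() => pending.length);

  let timer;
  const timedOut = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error('Images took too long to load.')),
      timeout
    );
  });

  // Whichever finishes first wins; the timer is always cleaned up.
  return Promise.race([settled, timedOut]).finally(() => clearTimeout(timer));
}
```

In the page it would be called as imagesSettled(document.images). A failed image counts as settled here, which matches the behaviour of complete.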

Image elements have a complete property. You can write a function that returns true once all images in the document have been fetched:

function imagesHaveLoaded() { return Array.from(document.images).every((i) => i.complete); }

You can wait for that function like this:

await page.waitForFunction(imagesHaveLoaded);
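One caveat with an every()-based predicate: every() returns true for an empty list, so if initData() inserts its images asynchronously the wait could finish before any <img> exists. The sketch below guards against that. The name imagesHaveLoadedAtLeast and the minCount parameter are hypothetical, and it assumes you know roughly how many images to expect.

```javascript
// Hypothetical predicate, meant to run inside the page.
// `minCount` is an assumed lower bound on how many <img> elements
// initData() is expected to create.
function imagesHaveLoadedAtLeast(minCount) {
  const imgs = Array.from(document.images);
  // [].every(...) is true, so also require that enough images exist.
  return imgs.length >= minCount && imgs.every((img) => img.complete);
}

// Usage in the Puppeteer script; arguments after the options object are
// passed on to the predicate:
// await page.waitForFunction(imagesHaveLoadedAtLeast, { timeout: 30000 }, 3);
```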

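Putting the function and the waitForFunction call together with your original code, and adding a timeout so that it does not wait indefinitely, we get: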

function imagesHaveLoaded() {
  return Array.from(document.images).every((i) => i.complete);
}

(async (dataObj) => {
  const url = dataObj.url;
  const payload = dataObj.payload;
  const browser = await puppeteer.launch({ headless: false, devtools: false });
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle0' });

  await page.evaluate((payload) => {
    initData(payload);
  }, payload);

  await page.waitForFunction(imagesHaveLoaded, { timeout: YOUR_DESIRED_TIMEOUT });

  await page.setViewport({ width: 1280, height: 720 });
  await page.screenshot({ path: 'test.png' });
  await browser.close();
})(dataObj);
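When the timeout elapses, waitForFunction rejects, so without a try/catch no screenshot is taken at all. If a screenshot with missing images is better than none, the wait can be wrapped as in the sketch below. The function name screenshotAfterImages is hypothetical, and checking err.name === 'TimeoutError' is an assumption about how Puppeteer labels its timeout errors; adjust it if your version differs.

```javascript
// Hypothetical wrapper around the steps above. `page` is a Puppeteer Page.
// Returns true if all images completed in time, false if the wait timed out.
async function screenshotAfterImages(page, payload, path, timeout = 30000) {
  // Populate the DOM with the page's own client-side function.
  await page.evaluate((data) => {
    initData(data);
  }, payload);

  let imagesReady = true;
  try {
    await page.waitForFunction(
      () => Array.from(document.images).every((img) => img.complete),
      { timeout }
    );
  } catch (err) {
    // Assumption: Puppeteer's timeout errors carry the name 'TimeoutError'.
    if (err.name !== 'TimeoutError') throw err;
    imagesReady = false;
  }

  // Take the screenshot either way; the caller can inspect the return value.
  await page.screenshot({ path });
  return imagesReady;
}
```

Calling browser.close() in a finally block around this function keeps the browser from lingering when an unexpected error is rethrown.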