Using Puppeteer, how can I get the timing information shown in Chrome DevTools' "Network" tab?

Below is a screenshot of me visiting https://www.ted.com, opening the "Network" tab in Google Chrome DevTools, and viewing the timing data for the root request and its child requests.

How can I get all of the timing data above programmatically using Puppeteer? Ideally, it would look like this JSON structure:

{
    name: "www.ted.com",
    queueTime: 0,
    startedTime: 1.93,
    stalledTime: 4.59,
    dnsLookupTime: 10.67,
    initialConnectionTime: <the number of milliseconds>,
    ...
},
{
    name: <the next child request>,
    ...
}

You want to look at a HAR (HTTP Archive) file, which is the file you create by saving the data from Chrome.

Quoting "What is a HAR file" (source):

The HAR file format is an evolving standard and the information contained within it is both flexible and extensible. You can expect a HAR file to include a breakdown of timings including:

  1. How long it takes to fetch DNS information
  2. How long each object takes to be requested
  3. How long it takes to connect to the server
  4. How long it takes to transfer assets from the server to the browser of each object

The data is stored as a JSON document and extracting meaning from the low-level data is not always easy. But with practice a HAR file can quickly help you identify the key performance problems with a web page, letting you efficiently target your development efforts at areas of your site that will deliver the greatest results.

Libraries such as puppeteer-har and chrome-har can create HAR files using Puppeteer.

Code sample (for puppeteer-har, quoted from its page):

const har = new PuppeteerHar(page);
await har.start({ path: 'results.har' });

await page.goto('http://example.com');

await har.stop();
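
Once the HAR file exists, getting to the shape asked for in the question is mostly a matter of reading `log.entries[].timings`. Here is a minimal sketch, assuming the file follows the HAR 1.2 layout; `summarizeHar` and the output field names are hypothetical and not part of puppeteer-har:

```javascript
// Sketch: summarize per-request timings from an already-parsed HAR object.
// Field names under `timings` follow the HAR 1.2 format; `summarizeHar`
// and the output shape are hypothetical.
function summarizeHar(har) {
  // HAR uses -1 for phases that do not apply to a request.
  const ms = (value) => (typeof value === 'number' && value >= 0 ? value : 0);

  return har.log.entries.map((entry) => {
    const t = entry.timings;
    return {
      name: entry.request.url,
      startedDateTime: entry.startedDateTime,
      blocked: ms(t.blocked),
      dns: ms(t.dns),
      connect: ms(t.connect),
      ssl: ms(t.ssl),
      send: ms(t.send),
      wait: ms(t.wait),
      receive: ms(t.receive),
      total: entry.time,
    };
  });
}

// Usage, assuming results.har was written by the snippet above:
// const fs = require('fs');
// const har = JSON.parse(fs.readFileSync('results.har', 'utf8'));
// console.log(summarizeHar(har));
```

Phases reported as -1 (for example DNS on a reused connection) are mapped to 0 here; keep the raw value instead if you need to tell "not applicable" apart from "took no time".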

Nowadays, you can also get this information without a HAR file.

Using performance.getEntriesByType("resource"):

// Obtain PerformanceEntry objects for resources
const performanceTiming = JSON.parse(
  await page.evaluate(() =>
    JSON.stringify(performance.getEntriesByType("resource"))
  )
);

// Optionally filter resource results to find your specifics - ex. filters on URL
const imageRequests = performanceTiming.filter((e) =>
  e.name.endsWith("/images")
);

console.log("Image Requests", imageRequests);
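
The entries returned this way carry raw timestamps rather than per-phase durations, so you may want to subtract them yourself. A rough sketch follows; `toPhases` and the output names are hypothetical, and the mapping to the DevTools labels is only approximate:

```javascript
// Sketch: derive phase durations (ms) from a serialized
// PerformanceResourceTiming entry. `toPhases` and the output field
// names are hypothetical.
function toPhases(entry) {
  // A start of 0 means the timestamp was not exposed or does not apply.
  const span = (start, end) => (start > 0 && end >= start ? end - start : 0);

  return {
    name: entry.name,
    startTime: entry.startTime,
    dnsLookupTime: span(entry.domainLookupStart, entry.domainLookupEnd),
    initialConnectionTime: span(entry.connectStart, entry.connectEnd),
    sslTime: span(entry.secureConnectionStart, entry.connectEnd),
    waitingTime: span(entry.requestStart, entry.responseStart),
    contentDownloadTime: span(entry.responseStart, entry.responseEnd),
    totalTime: entry.duration,
  };
}

// Usage with the array obtained above:
// console.log(performanceTiming.map(toPhases));
```

Note that cross-origin resources served without a Timing-Allow-Origin header report 0 for most of these timestamps, so their phases come out as 0 even though `duration` is still populated.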

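HAR files are a good option, but if you want something more custom, you can use Puppeteer to record request timing data by navigating to the page you want to analyze and tapping into the Chrome DevTools Protocol.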

const pptr = require('puppeteer');

(async function() {
  // launch in headless mode & create a new page
  const browser = await pptr.launch({
    headless: true,
  });
  const page = await browser.newPage();

  // attach cdp session to page
  const client = await page.target().createCDPSession();
  await client.send('Debugger.enable');
  await client.send('Debugger.setAsyncCallStackDepth', { maxDepth: 32 });

  // enable network
  await client.send('Network.enable');
  // attach callback to network response event (client.on is synchronous, no await needed)
  client.on('Network.responseReceived', (params) => {
    const { response: { url, timing } } = params;
    /*
     * See: https://chromedevtools.github.io/devtools-protocol/tot/Network/#type-ResourceTiming
     * for the complete list of timing data available under 'timing'
     */
    console.log(url, timing);
  });

  await page.goto('https://www.ted.com/', {
    waitUntil: 'networkidle2',
  });

  // cleanup
  await browser.close();
})();

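In your case, you can listen for the Network.responseReceived event and parse out the timing data (such as the responseTime and timing fields) nested in the response property of the object passed to the event listener callback. Their documentation on the interfaces is pretty good. I'll list the relevant pages below: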


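Chrome DevTools Protocol documentation

The data you can expect to receive in each Network.responseReceived event callback: Network.responseReceived

More specific response-related data, under the response property: Network.Response

Finally, the nested request timing data you are looking for, under timing: Network.ResourceTiming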
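
The fields in Network.ResourceTiming are millisecond offsets relative to requestTime, with -1 meaning the phase did not occur. A sketch of turning them into durations is below; `phasesFromTiming` and the output names are hypothetical, and they only approximate the labels shown in the DevTools UI:

```javascript
// Sketch: turn a Network.ResourceTiming object into per-phase durations (ms).
// Offsets are milliseconds relative to requestTime; -1 means "not applicable".
// `phasesFromTiming` and the output names are hypothetical.
function phasesFromTiming(timing) {
  // response.timing is optional in the protocol, so it may be missing.
  if (!timing) return null;

  const span = (start, end) => (start >= 0 && end >= start ? end - start : 0);

  return {
    dnsLookupTime: span(timing.dnsStart, timing.dnsEnd),
    initialConnectionTime: span(timing.connectStart, timing.connectEnd),
    sslTime: span(timing.sslStart, timing.sslEnd),
    requestSentTime: span(timing.sendStart, timing.sendEnd),
    waitingTime: span(timing.sendEnd, timing.receiveHeadersEnd),
  };
}

// Usage inside a Network.responseReceived callback:
// console.log(params.response.url, phasesFromTiming(params.response.timing));
```

Content download time is not part of this object; it has to be derived from a later event such as Network.loadingFinished.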


You may also want to look at the Network.requestWillBeSent interface. You will be able to match requests and responses via requestId.
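
As a rough illustration of that matching, the sketch below keeps one record per requestId and fills it in as events arrive. `createCollector` is a hypothetical helper; the event field names (requestId, request.url, timestamp, response.timing) are taken from the Network domain of the protocol:

```javascript
// Sketch: correlate CDP network events by requestId.
// `createCollector` is a hypothetical helper, not part of Puppeteer.
function createCollector() {
  const requests = new Map();

  return {
    onRequestWillBeSent({ requestId, request, timestamp }) {
      requests.set(requestId, { name: request.url, startedAt: timestamp });
    },
    onResponseReceived({ requestId, response }) {
      const record = requests.get(requestId);
      if (record) record.timing = response.timing || null;
    },
    onLoadingFinished({ requestId, timestamp }) {
      const record = requests.get(requestId);
      // Event timestamps are monotonic seconds; convert the difference to ms.
      if (record) record.totalTime = (timestamp - record.startedAt) * 1000;
    },
    results: () => [...requests.values()],
  };
}

// Wiring, assuming `client` is a CDP session with Network.enable already sent:
// const collector = createCollector();
// client.on('Network.requestWillBeSent', (p) => collector.onRequestWillBeSent(p));
// client.on('Network.responseReceived', (p) => collector.onResponseReceived(p));
// client.on('Network.loadingFinished', (p) => collector.onLoadingFinished(p));
// ...after page.goto() resolves: console.log(collector.results());
```

Redirects reuse the same requestId, so in this sketch a redirected request overwrites the earlier hop; keep an array per id if you need every hop.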

From here, you have more than enough data at your disposal for the page you visit, and you can format it however you like.