Is Puppeteer's `page.content()` always in UTF-8 or in the page-specific charset?
Is the string returned by Puppeteer's page.content() always in UTF-8, or does it use the page-specific charset?
I see that it uses document.documentElement.outerHTML internally
(see source code), but I'm not sure how that works.
Diving into the outerHTML documentation:
Reading the value of outerHTML returns a DOMString containing an HTML
serialization of the element and its descendants. Setting the value of
outerHTML replaces the element and all of its descendants with a new
DOM tree constructed by parsing the specified htmlString.
Digging into the DOMString documentation:
DOMString is a UTF-16 String. As JavaScript already uses such
strings, DOMString is mapped directly to a String.
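It looks like the mystery ends here: the returned value is an ordinary JavaScript string (UTF-16 in memory), so neither UTF-8 nor the page's charset applies until the string is encoded into bytes.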
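To illustrate the point, here is a minimal Node.js sketch, not taken from Puppeteer itself. The literal `html` stands in for the value of `await page.content()`, and `htmlToBytes` is a hypothetical helper name. It only shows that the encoding is chosen at the moment the string is converted to bytes, for example when writing it to a file.

```javascript
// Sketch: a JavaScript string has no "charset" of its own.
// An encoding is only picked when the string is turned into bytes.

// Hypothetical helper (not part of Puppeteer): encode an HTML string to bytes.
function htmlToBytes(html, encoding = 'utf8') {
  return Buffer.from(html, encoding);
}

// Stand-in for `await page.content()`; \u00e9 is "é" and \u20ac is "€".
const html = '<html><head></head><body>\u00e9\u20ac</body></html>';

// The same string, serialized with two different encodings.
const utf8Bytes = htmlToBytes(html, 'utf8');
const utf16Bytes = htmlToBytes(html, 'utf16le');

console.log('UTF-16 code units in the string:', html.length);
console.log('bytes when encoded as UTF-8:', utf8Bytes.length);
console.log('bytes when encoded as UTF-16LE:', utf16Bytes.length);

// Decoding with the matching encoding gives back the original string.
console.log(utf8Bytes.toString('utf8') === html);
console.log(utf16Bytes.toString('utf16le') === html);
```

In a real script, the same idea would apply to something like `fs.writeFileSync(path, html, 'utf8')`: the second-to-last step decides the bytes on disk, not the page that was loaded.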