How to access the iframe #document using puppeteer?
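I'm trying to scrape an anime video page [jkanime], but I'm having trouble getting the mp4 video sources because they sit inside an iframe #document.
In Chrome DevTools, I entered the following: $('#jkvideo_html5_api source').src
and the mp4 src was shown to me. But I don't know how to run the query $('#jkvideo_html5_api source').src with puppeteer.
Now... what I want to achieve is to get the value of _navigationURL, then request it and extract the mp4 video source.
Any help will be greatly appreciated!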
[Image: devtool source code section]
const getAnimeVideo = async (id: string, chapter: number) => {
  const BASE_URL = `${url}${id}/${chapter}/` // => https://jkanime.net/tokyo-ghoul/1/
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto(BASE_URL);
  const elementHandle = await page.$('.player_conte')
  const frame = await elementHandle.contentFrame();
  const $ = cheerio.load(`${frame}`);
  console.log(frame)
}
Partial output obtained:
....
OMWorld {
_frameManager:
FrameManager {
_events: [Object],
_eventsCount: 3,
_maxListeners: undefined,
_client: [CDPSession],
_page: [Page],
_networkManager: [NetworkManager],
_timeoutSettings: [TimeoutSettings],
_frames: [Map],
_contextIdToContext: [Map],
_isolatedWorlds: [Set],
_mainFrame: [Frame] },
_frame: [Circular],
_timeoutSettings:
TimeoutSettings { _defaultTimeout: null, _defaultNavigationTimeout: null },
_documentPromise: null,
_contextResolveCallback: null,
_contextPromise: Promise { [ExecutionContext] },
_waitTasks: Set {},
_detached: false },
_childFrames: Set {},
_name: '',
_navigationURL:
'https://jkanime.net/um.php?e=Q0VxeUQ2MmZRRlNWeUdHKzdoWlJQOGFLNjFRUnljVkFTaEtFMElZUjFmTlRPQnhnUUtqbnRodjhEVHlGYnVleWJsdnNnRy9wNzVLd0MrMURuRVBKV0tQZjVuT0tIblc3cUNmZDNzdFVFaEE9OjrIf8cc_60GOGTTN7Th9Q_a' }
The output I want to get:
{
"src": [
"https://storage.googleapis.com/markesito.appspot.com/tokgho/01.mp4"
]
}
Problem solved: 11:34am
const getAnimeVideo = async (id: string, chapter: number) => {
  const BASE_URL = `${url}${id}/${chapter}/` // => https://jkanime.net/tokyo-ghoul/1/
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto(BASE_URL);
  // Grab the <iframe> element, then switch into the frame it hosts.
  const elementHandle = await page.$('.player_conte')
  const frame = await elementHandle.contentFrame();
  // Run the query inside the iframe's document and collect every <source> src.
  const video = await frame.$eval('#jkvideo_html5_api', el =>
    Array.from(el.getElementsByTagName('source')).map(e => e.getAttribute("src")));
  // Close the browser so the Chromium process does not linger.
  await browser.close();
  return video;
}
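For anyone wondering why the first attempt failed: `contentFrame()` returns a Frame object, not HTML, so interpolating it into a template literal most likely hands cheerio the string `[object Object]`. The query has to run inside the frame instead. Below is a minimal sketch of the same idea using `$$eval`, which also wraps the result in the `{ src: [...] }` shape from the question. `FrameLike`, `SourceEl`, and `getVideoSources` are hypothetical names introduced only for this sketch; they describe just the small part of a frame the function relies on, so it can be tried without launching a browser.

```typescript
// Hypothetical minimal shapes (not puppeteer's own types) so the sketch is self-contained.
// A real puppeteer Frame offers a `$$eval(selector, pageFunction)` method of this form.
type SourceEl = { getAttribute(name: string): string | null };

interface FrameLike {
  $$eval<T>(selector: string, fn: (els: SourceEl[]) => T): Promise<T>;
}

// Collect the src of every <source> under the player and wrap it as { src: [...] }.
async function getVideoSources(frame: FrameLike): Promise<{ src: string[] }> {
  // The callback runs against the elements matched inside the iframe's document.
  const raw = await frame.$$eval('#jkvideo_html5_api source', els =>
    els.map(e => e.getAttribute('src')));
  // Drop <source> tags that carry no src attribute.
  const src = raw.filter((s): s is string => s !== null);
  return { src };
}
```

With real puppeteer you would pass in the `frame` obtained from `elementHandle.contentFrame()`; depending on the puppeteer typings in use, a cast may be needed, which is an assumption of this sketch rather than something I have checked against every version.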