Puppeteer $.eval selecting nested elements
Suppose I have a page similar to this:
<div id="details-container" class="style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="subheadline style-scope ytd-channel-about-metadata-renderer">Details</yt-formatted-string>
<table class="style-scope ytd-channel-about-metadata-renderer">
<tbody class="style-scope ytd-channel-about-metadata-renderer"><tr class="style-scope ytd-channel-about-metadata-renderer">
<td class="label style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="style-scope ytd-channel-about-metadata-renderer"></yt-formatted-string>
</td>
<td class="style-scope ytd-channel-about-metadata-renderer">
<ytd-button-renderer align-by-text="" class="style-scope ytd-channel-about-metadata-renderer" button-renderer=""></ytd-button-renderer>
<div id="captcha-container" class="style-scope ytd-channel-about-metadata-renderer"></div>
<div id="email-container" class="style-scope ytd-channel-about-metadata-renderer"></div>
<a id="email" target="_blank" class="style-scope ytd-channel-about-metadata-renderer" href="mailto:undefined" hidden=""></a>
</td>
</tr>
<tr class="style-scope ytd-channel-about-metadata-renderer">
<td class="label style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="style-scope ytd-channel-about-metadata-renderer"><span class="deemphasize style-scope yt-formatted-string"> Location: </span></yt-formatted-string>
</td>
<td class="style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="style-scope ytd-channel-about-metadata-renderer">YourCountry</yt-formatted-string>
</td>
</tr>
</tbody></table>
</div>
Suppose I need to get "YourCountry". How do I actually select that element?
Here is what I have tried so far:
const location = await page.$$eval(
"#details-container > table > tbody:nth-child(1) > tr:nth-child(1) > yt-formatted-string",
locationEl => locationEl.innerHTML
);
console.log(location) // Undefined
I'm not sure what to do. Returning just the tr and then evaluating tr[1] again doesn't work, because it says tr has no function .$$eval.
Note that I am using Apify to fetch the page.
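In the HTML you provided, the yt-formatted-string element you want is a direct child of the second td under the second tr, but your selector tries to match a yt-formatted-string that is a direct child of the first tr (tr:nth-child(1)). You need to fix the selector. For example: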
console.log("HTML:", document.querySelector("#details-container > table > tbody > tr:nth-child(2) > td:nth-child(2) > yt-formatted-string").innerHTML)
<div id="details-container" class="style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="subheadline style-scope ytd-channel-about-metadata-renderer">Details</yt-formatted-string>
<table class="style-scope ytd-channel-about-metadata-renderer">
<tbody class="style-scope ytd-channel-about-metadata-renderer">
<tr class="style-scope ytd-channel-about-metadata-renderer">
<td class="label style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="style-scope ytd-channel-about-metadata-renderer"></yt-formatted-string>
</td>
<td class="style-scope ytd-channel-about-metadata-renderer">
<ytd-button-renderer align-by-text="" class="style-scope ytd-channel-about-metadata-renderer" button-renderer=""></ytd-button-renderer>
<div id="captcha-container" class="style-scope ytd-channel-about-metadata-renderer"></div>
<div id="email-container" class="style-scope ytd-channel-about-metadata-renderer"></div>
<a id="email" target="_blank" class="style-scope ytd-channel-about-metadata-renderer" href="mailto:undefined" hidden=""></a>
</td>
</tr>
<tr class="style-scope ytd-channel-about-metadata-renderer">
<td class="label style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="style-scope ytd-channel-about-metadata-renderer"><span class="deemphasize style-scope yt-formatted-string"> Location: </span></yt-formatted-string>
</td>
<td class="style-scope ytd-channel-about-metadata-renderer">
<yt-formatted-string class="style-scope ytd-channel-about-metadata-renderer">YourCountry</yt-formatted-string>
</td>
</tr>
</tbody>
</table>
</div>
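For reference, here is a minimal sketch of how the corrected selector might be used from Puppeteer instead of the browser console. It assumes page is an already-loaded Puppeteer Page; the helper names getLocation and getCellTexts are made up for illustration. Note that page.$eval passes a single element to the callback, while page.$$eval passes an array of elements, so reading .innerHTML directly on the $$eval argument yields undefined even when the selector matches.

```javascript
// Corrected selector from the snippet above: second row, second cell.
const LOCATION_SELECTOR =
  "#details-container > table > tbody > tr:nth-child(2) > td:nth-child(2) > yt-formatted-string";

// Hypothetical helper: $eval hands the callback ONE element
// (like document.querySelector) and rejects if nothing matches.
async function getLocation(page) {
  return page.$eval(LOCATION_SELECTOR, (el) => el.innerHTML.trim());
}

// Hypothetical helper: $$eval hands the callback an ARRAY of elements
// (like document.querySelectorAll), so map over it.
async function getCellTexts(page) {
  return page.$$eval(
    "#details-container td > yt-formatted-string",
    (els) => els.map((el) => el.textContent.trim())
  );
}
```

Inside the page-handling function this could be called as const location = await getLocation(page);.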
You should also be able to call $$eval etc. if you have an ElementHandle. The problem is that your selector doesn't match anything, so you don't have one.
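As a rough illustration of the ElementHandle point: page.$ resolves to an ElementHandle, or to null when nothing matches, and the handle has its own $eval and $$eval that search only inside that element. The sketch below assumes a loaded Puppeteer page; getLocationFromRow is a hypothetical name.

```javascript
// Hypothetical helper: grab the second row as an ElementHandle,
// then query inside that row only.
async function getLocationFromRow(page) {
  const row = await page.$("#details-container > table > tbody > tr:nth-child(2)");
  if (row === null) {
    // The selector matched nothing, so there is no handle to call $eval on.
    return null;
  }
  // ElementHandle.$eval runs querySelector relative to the row.
  return row.$eval(
    "td:nth-child(2) > yt-formatted-string",
    (el) => el.textContent.trim()
  );
}
```

Checking for null first makes a non-matching selector visible right away, rather than surfacing later as a missing-method error.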
I prefer to use jQuery; in my view it is the most convenient way to query elements.
For example, you can inject jQuery using the Apify utils.
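const Apify = require('apify');
const { puppeteer } = Apify.utils;

// Inject jQuery into the page so that $ is available inside page.evaluate.
await puppeteer.injectJQuery(page);
const location = await page.evaluate(() => {
    // In the HTML above, the last yt-formatted-string holds the location.
    return $('#details-container yt-formatted-string').last().text();
});
console.log(location);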
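Taking the last element depends on the row order staying the same. As a hedged alternative, the row could be looked up by its label text instead. This is only a sketch: it uses plain Puppeteer without jQuery, getDetail is a hypothetical helper name, and it assumes each row has a label cell followed by a value cell, as in the HTML from the question.

```javascript
// Hypothetical helper: find the row whose first cell starts with `label`
// and return the text of its second cell, or null if there is no such row.
async function getDetail(page, label) {
  // Collect the trimmed text of every td, row by row.
  const rows = await page.$$eval("#details-container tr", (trs) =>
    trs.map((tr) =>
      Array.from(tr.querySelectorAll("td"), (td) => td.textContent.trim())
    )
  );
  const match = rows.find((cells) => cells[0] && cells[0].startsWith(label));
  return match ? match[1] : null;
}
```

It could then be called as const location = await getDetail(page, "Location");.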