Get a link if the discount is greater than the given value
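I want to get the links of all products on a website that have a discount of more than 20%. I managed to get the discount values into one array and the links into a second array, but I can't figure out how to pull only the links of the products whose discount is greater than 20%.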
const puppeteer = require('puppeteer')
const minDiscount = "20"

async function getLinks(){
  const browser = await puppeteer.launch({headless: false, defaultViewport: null});
  const page = await browser.newPage();
  const url = 'https://www.chainreactioncycles.com/mtb/mountain-bikes'
  await page.goto(url)
  const discount = await page.$$eval('.savedamount .pixel_separator', (discount) => discount.map(discount => discount.innerText.replace(/[^0-9.]/g, '').replace(/\D+/g,'0')));
  await page.waitForTimeout(2000);
  if (discount >= minDiscount) {
    const links = await page.$$eval('.description a', (allAs) => allAs.map((a) => a.href));
    await page.waitForTimeout(2000);
    console.log(links)
    console.log(discount)
  } else {
    console.log("error")
  }
}
This is my first question post. I have only just started learning the basics, so I am asking for help.
The output I get:
[
'https://www.chainreactioncycles.com/nukeproof-scout-275-pro-bike-slx-2021/rp-prod196181',
'https://www.chainreactioncycles.com/nukeproof-scout-290-comp-bike-deore12-2021/rp-prod196185',
'https://www.chainreactioncycles.com/nukeproof-reactor-290-pro-alloy-bike-gx-eagle-2021/rp-prod196204',
'https://www.chainreactioncycles.com/cube-stereo-120-pro-suspension-bike-2022/rp-prod209176',
'https://www.chainreactioncycles.com/nukeproof-reactor-290-factory-carbon-bike-xt-2021/rp-prod196202',
'https://www.chainreactioncycles.com/gt-force-expert-suspension-bike-2021/rp-prod202172',
'https://www.chainreactioncycles.com/nukeproof-reactor-275-comp-alloy-bike-deore-2021/rp-prod196199',
'https://www.chainreactioncycles.com/ragley-blue-pig-hardtail-bike-2021/rp-prod197475',
'https://www.chainreactioncycles.com/gt-zaskar-lt-al-expert-hardtail-bike-2021/rp-prod197576',
'https://www.chainreactioncycles.com/nukeproof-scout-275-race-bike-deore10-2021/rp-prod196186',
'https://www.chainreactioncycles.com/gt-aggressor-comp-hardtail-bike-2021/rp-prod200304',
'https://www.chainreactioncycles.com/gt-zaskar-lt-al-elite-hardtail-bike-2021/rp-prod197577',
'https://www.chainreactioncycles.com/octane-one-melt-pump-track-bike-2021/rp-prod191780',
'https://www.chainreactioncycles.com/gt-avalanche-sport-hardtail-bike-2021/rp-prod200318',
'https://www.chainreactioncycles.com/gt-avalanche-comp-hardtail-bike-2021/rp-prod200319',
'https://www.chainreactioncycles.com/cube-aim-sl-29-hardtail-bike-2021/rp-prod200788',
'https://www.chainreactioncycles.com/cube-aim-pro-29-hardtail-bike-2021/rp-prod200664',
'https://www.chainreactioncycles.com/gt-aggressor-sport-hardtail-bike-2021/rp-prod200320',
'https://www.chainreactioncycles.com/nukeproof-mega-290-pro-alloy-bike-gx-eagle-2021/rp-prod196137',
'https://www.chainreactioncycles.com/vitus-sentier-29-vr-mountain-bike-2021/rp-prod195562',
'https://www.chainreactioncycles.com/nukeproof-mega-290-factory-carbon-bike-xt-2021/rp-prod196140',
'https://www.chainreactioncycles.com/cube-acid-29-hardtail-bike-2021/rp-prod200670',
'https://www.chainreactioncycles.com/nukeproof-giga-290-elite-carbon-bike-slx-2021/rp-prod196167',
'https://www.chainreactioncycles.com/vitus-nucleus-27-vrw-womens-mountain-bike-2021/rp-prod195569',
'https://www.chainreactioncycles.com/fuji-nevada-27-5-1-9-hardtail-bike-2022/rp-prod201686',
'https://www.chainreactioncycles.com/nukeproof-mega-290-comp-alloy-bike-deore-2021/rp-prod196147',
'https://www.chainreactioncycles.com/kona-lana-i-hardtail-bike-2022/rp-prod206856',
'https://www.chainreactioncycles.com/commencal-meta-tr-29-origin-suspension-bike-2021/rp-prod199874',
'https://www.chainreactioncycles.com/vitus-rapide-fs-crx-mountain-bike-2021/rp-prod198701',
'https://www.chainreactioncycles.com/nukeproof-dissent-297-rs-bike-x01-dh-2021/rp-prod196828',
'https://www.chainreactioncycles.com/nukeproof-giga-290-factory-carbon-bike-xt-2021/rp-prod196165',
'https://www.chainreactioncycles.com/vitus-sentier-27-vrs-mountain-bike-2021/rp-prod195560',
'https://www.chainreactioncycles.com/vitus-sentier-27-vr-mountain-bike-2021/rp-prod195592',
'https://www.chainreactioncycles.com/cube-gear-hanger-agree-c62-sl-nuroad-race/rp-prod206882',
'https://www.chainreactioncycles.com/gt-sensor-carbon-elite-suspension-bike-2021/rp-prod202174',
'https://www.chainreactioncycles.com/vitus-nucleus-27-vr-mountain-bike-blue-2021/rp-prod195556',
'https://www.chainreactioncycles.com/vitus-mythique-27-vr-mountain-bike-2021/rp-prod195583',
'https://www.chainreactioncycles.com/nukeproof-mega-290-elite-carbon-bike-slx-2021/rp-prod196138',
'https://www.chainreactioncycles.com/commencal-meta-ht-am-origin-27-5-hardtail-bike-2021/rp-prod199876',
'https://www.chainreactioncycles.com/fuji-nevada-29-1-9-hardtail-bike-2022/rp-prod201656',
'https://www.chainreactioncycles.com/vitus-sentier-29-vrs-mountain-bike-2021/rp-prod195597',
'https://www.chainreactioncycles.com/ghost-lanao-base-27-5-hardtail-bike-2021/rp-prod201636',
'https://www.chainreactioncycles.com/ragley-marley-1-0-hardtail-bike-2021/rp-prod197469',
'https://www.chainreactioncycles.com/commencal-meta-am-29-ohlins-suspension-bike-2021/rp-prod199946',
'https://www.chainreactioncycles.com/kona-lava-dome-hardtail-bike-2022/rp-prod206858',
'https://www.chainreactioncycles.com/kona-kahuna-hardtail-bike-2022/rp-prod206862',
'https://www.chainreactioncycles.com/cube-aim-pro-27-5-hardtail-bike-2021/rp-prod200657',
'https://www.chainreactioncycles.com/ns-bikes-clash-dirt-jump-bike-2021/rp-prod197914'
]
[ '20', '95', '14', '15', '10', '10' ]
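I haven't tested the full code given below, so you may need to adjust it.
Since I don't know your coding background, I'll be as thorough as possible. Keep in mind that I have never developed anything with Puppeteer and I don't have 30+ years of experience in JS, so what I write may not be the best-optimized solution.
First, I would make minDiscount a number instead of a string, so that two numbers are compared correctly.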
const minDiscount = 20;
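To see why the type matters, here is a small sketch using the discount values from the question's output. It only illustrates how JavaScript's `>=` coerces its operands; the variable names are made up for this example.

```javascript
// Discount values exactly as the question's scraper returned them (strings).
const discounts = ['20', '95', '14', '15', '10', '10'];

// Array vs. string: the array is first joined into '20,95,14,15,10,10',
// then compared character by character with '20'. This is true here, which
// is why the original `if` ran and printed every link.
const arrayVsString = discounts >= '20';

// Array vs. number: the joined string cannot be converted to a number (NaN),
// so the comparison is always false.
const arrayVsNumber = discounts >= 20;

// String vs. string compares lexicographically: '9' sorts after '20'.
const stringVsString = '9' >= '20';

// Converting each entry to a number and testing it individually gives the
// intended result.
const passing = discounts.map(Number).filter((d) => d >= 20);

console.log(arrayVsString, arrayVsNumber, stringVsString, passing);
```

So the threshold should be a number, and the comparison has to be made per discount value rather than against the whole array.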
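Then, I wouldn't collect the discounts and the links in two separate arrays because, as you have already figured out, you can't relate one to the other. Instead, just get all the products: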
const products = await page.$$('.products_details_container');
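According to the Puppeteer documentation, you get an array of ElementHandle. From there, try to get the discount of each element, if it has one. If it does, parse it to an integer and compare it with your minDiscount. If it is >= 20, finally get the product's link and push it, together with the discount, into a final array.
Full code, commented: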
const puppeteer = require('puppeteer');

const minDiscount = 20;

async function getLinks() {
  const browser = await puppeteer.launch({
    headless: true,
    defaultViewport: null,
  });
  const page = await browser.newPage();
  const url = 'https://www.chainreactioncycles.com/mtb/mountain-bikes';
  await page.goto(url);
  // getting all the products, this will return an array of ElementHandle
  const products = await page.$$('.products_details_container');
  const proms = await Promise.allSettled(
    products.map(async (prod) => {
      // searching for a discount on each product
      const disc = await prod.$$eval(
        '.savedamount .pixel_separator',
        (discount) =>
          discount.map((discItem) =>
            discItem.innerText.replace(/[^0-9.]/g, '').replace(/\D+/g, '0')
          )
      );
      // if it has a discount
      if (disc.length > 0) {
        // we parse the discount to an integer to compare it to minDiscount
        const discountInt = parseInt(disc[0], 10);
        if (discountInt >= minDiscount) {
          // we get the link of the product
          const link = await prod.$$eval('.description a', (allAs) => allAs.map((a) => a.href));
          if (link.length > 0) {
            // if all went well, we return an object containing the discount and the link of the product
            return { discount: discountInt, link: link[0] };
          }
        }
      }
      return null;
    })
  );
  // keep only the fulfilled results that hold a product: rejected promises
  // have no value, and products without a qualifying discount resolve to null
  const endArray = proms
    .filter((item) => item.status === 'fulfilled' && item.value !== null)
    .map((item) => item.value);
  console.log(endArray);
  // close the browser so the Node process can exit
  await browser.close();
}

getLinks();
As I said, this is neither optimal nor optimized, but I'll leave that task to you.
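As one possible starting point for that cleanup, the filtering step can be pulled out into a plain function that runs without a browser. This is only a sketch under assumptions: `parseDiscount` and `selectDiscounted` are hypothetical helper names, the sample records and their "Save 20%" wording are invented for illustration, and the Puppeteer call shown in the comment is untested against the real page.

```javascript
// Hypothetical helper: extracts the first number from raw discount text.
// Returns null when the text contains no digits (product without a discount).
function parseDiscount(text) {
  const match = String(text).match(/\d+(\.\d+)?/);
  return match ? parseFloat(match[0]) : null;
}

// Hypothetical helper: keeps only products that have a link and a discount
// of at least minDiscount, returning { discount, link } objects.
function selectDiscounted(products, minDiscount) {
  return products
    .map(({ discountText, link }) => ({ discount: parseDiscount(discountText), link }))
    .filter((p) => p.discount !== null && p.link && p.discount >= minDiscount);
}

// In the scraper, the input could be collected in a single call, assuming the
// selectors from the answer above still match the page:
//
//   const raw = await page.$$eval('.products_details_container', (nodes) =>
//     nodes.map((n) => ({
//       discountText: n.querySelector('.savedamount .pixel_separator')?.innerText ?? null,
//       link: n.querySelector('.description a')?.href ?? null,
//     }))
//   );

// Invented sample data standing in for the scraped records.
const sample = [
  { discountText: 'Save 20%', link: 'https://example.com/a' },
  { discountText: 'Save up to 95%', link: 'https://example.com/b' },
  { discountText: 'Save 14%', link: 'https://example.com/c' },
  { discountText: null, link: 'https://example.com/d' },
];

const result = selectDiscounted(sample, 20);
console.log(result);
```

Keeping the discount and the link in the same record is what makes the filter possible, and a pure function like this can be checked with sample data before pointing it at the live site.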