NodeJS - how to scrape ld+json data and save it to an object
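I have been trying to find a way to get the content of an application/ld+json script tag and save it to a local object. What I want is to store it in an object so that, in my program, console.log(data.offers.availability) logs "InStock", and likewise for every other value in the data.
I currently have this: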
let content = JSON.stringify($("script[type='application/ld+json']").html())
let filteredJson = content.replace(/\n/g, '')
let results = JSON.parse(filteredJson)
console.log(results)
This results in the following, and does not allow me to console.log(results.offers.availability):
{ "@context": "http://schema.org/",
"@type": "Product", "name": "Apex Legends - Bangalore - Mini Epics",
"description": "<div class="textblock"><p><h2>Apex Legends - Bangalore - Mini Epics </h2><p>Helden uit alle uithoeken van de wereld strijden voor eer, roem en fortuin in Apex Legends. Weta Workshop betreedt the Wild Frontier en brengt Bangalore met zich mee - Mini Epics style!</p><p>Verzamel alle Apex Legends Mini Epics en voeg ook Bloodhound en Mirage toe aan je collectie!</p></p></div>",
"brand": {
"@type": "Thing",
"name": "Game Mania"
},
"aggregateRating": {
"@type": "AggregateRating",
"ratingValue": "5",
"ratingCount": "2"
},
"offers": {
"@type": "Offer",
"priceCurrency": "EUR",
"price": "19.98",
"availability" : "InStock"
}
}
The above is the data I am trying to scrape and save.
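As Bergi pointed out, the problem is that you are calling JSON.stringify on content that is already a string. Out of curiosity, I tried this myself. Consider the following test:
index.html (served at localhost:4000):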
<html>
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Product",
"name": "Apex Legends - Bangalore - Mini Epics",
"offers": {
"@type": "Offer",
"priceCurrency": "EUR",
"price": "19.98",
"availability": "InStock"
}
}
</script>
<body>
<h2>Index</h2>
</body>
</html>
NodeJS script:
const superagent = require('superagent');
const cheerio = require('cheerio');
(async () => {
const response = await superagent("http://localhost:4000");
const $ = cheerio.load(response.text);
// note that I'm not using .html(), although it works for me either way
const jsonRaw = $("script[type='application/ld+json']")[0].children[0].data;
// do not use JSON.stringify on the jsonRaw content, as it's already a string
const result = JSON.parse(jsonRaw);
console.log(result.offers.availability);
})()
result is now an object that holds the data from the script tag, and logging result.offers.availability prints InStock as expected.
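To see why the JSON.stringify call in the question gets in the way, here is a minimal sketch that needs no network request or cheerio. The jsonRaw string is a hypothetical, shortened stand-in for the text inside the script tag, not the real page content.

```javascript
// Hypothetical stand-in for the text of the <script type="application/ld+json"> tag.
const jsonRaw = `
{
  "@type": "Product",
  "offers": { "@type": "Offer", "availability": "InStock" }
}`;

// Stringifying a string wraps it in quotes and escapes its contents,
// so parsing it again only undoes that step and returns the original string.
const roundTripped = JSON.parse(JSON.stringify(jsonRaw));
console.log(typeof roundTripped);   // string
console.log(roundTripped.offers);   // undefined

// Parsing the raw text directly produces the object.
const result = JSON.parse(jsonRaw);
console.log(typeof result);               // object
console.log(result.offers.availability);  // InStock
```

Because roundTripped is still a string, roundTripped.offers is undefined, and reading .availability from it throws a TypeError. That matches the behaviour described in the question.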