How do I remove empty <p> tags from a string using JavaScript or Cheerio?
I have some HTML as a string:
"<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>"
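How can I remove the empty p tags from this string using Cheerio or plain JS?
I tried searching Stack Overflow and Google, but found no clear working solution.
Edit: Sorry, I just noticed that my string has a lot of whitespace between the tags.
Here is a sample of what appears when I console.log it in my application: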
<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p>
<p></p>
<p>Let's write another paragraph, and see how it renders when I read this post later. </p>
<p></p>
<p>Let's write another paragraph, and see how it renders when I read this post later. </p>
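You can replace each empty "<p></p>" (optionally containing whitespace), as well as any self-closing "<p/>", with the empty string "":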
var str = "<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>";
str = str.replace(/<p>\s*<\/p>/ig, '');
str = str.replace(/<p\s*\/>/ig, '');
console.log(str);
You can try this:
let str = "<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>";
// Empty <p> elements that have attributes are removed as well.
str = str.replace(/<p(\s+[a-z0-9\-_\'\"=]+)*><\/p>/ig, '');
console.log(str);
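The two regex answers above each cover one case: whitespace inside the tag, or attributes on the tag. A minimal sketch that combines both is shown below. The helper name removeEmptyParagraphs is hypothetical, and the pattern assumes no attribute value contains a literal ">" character.

```javascript
// Hypothetical helper (not from the original answers): removes <p> elements
// whose body is empty or whitespace-only, with or without attributes.
function removeEmptyParagraphs(html) {
  // "<p", optional attributes (anything up to ">"), optional whitespace, "</p>".
  // The \s after "<p" keeps tags such as <pre> from matching.
  return html.replace(/<p(?:\s[^>]*)?>\s*<\/p>/gi, '');
}

const input = '<p>First</p>\n<p></p>\n<p class="x"> </p><p>Second</p>';
const output = removeEmptyParagraphs(input);
console.log(output);
```

The newlines between the tags are left in place; only the empty paragraphs themselves are removed.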
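You can use the replace method:
const str = "<p>This is some HTML code</p>";
// With string patterns, replace() only replaces the first occurrence of each.
const stripped = str.replace("<p>", "").replace("<\/p>", "");
console.log(stripped);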
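If the tags have no attributes, you can use .replace("<p></p>", ""), although with a string pattern this removes only the first occurrence. If they do have attributes, there is another way, besides using a regular expression to capture and replace the tags.
A good approach is to use the native DOM functions.
To remove empty tags, you can use the following selector.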
document.querySelectorAll("*:empty").forEach((x)=>{x.remove()});
In your case it might look like this:
var div = document.createElement("div");
// Put your variable containing the HTML here.
div.innerHTML = "<p>hello there</p><p class='empty'></p><p>Not empty</p><p></p>";
div.querySelectorAll("*:empty").forEach((x) => { x.remove(); });
// Output: div.innerHTML == "<p>hello there</p><p>Not empty</p>"
// Then use the remaining innerHTML as you wish.
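Note, however, that :empty does not match an element that contains whitespace, such as <p> </p>.
Also note that *:empty matches void (self-closing) elements such as <br> and <img>, so they are removed too; use p:empty to target only paragraphs.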
const regex = /<[^>]*>\s*<\/[^>]*>/;
const str = `<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now</p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>`;
let m;
if ((m = regex.exec(str)) !== null) {
  // The result can be accessed through the `m`-variable.
  m.forEach((match, groupIndex) => {
    console.log(`Found match, group ${groupIndex}: ${match}`);
  });
}
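One caveat with the generic pattern above: it does not require the opening and closing tag names to agree, so in a string like "<div><p>text</p></div>" it matches "</p></div>". A hedged sketch of a stricter variant follows. It uses a backreference to tie the closing tag to the opening one and repeats until nothing changes, so nested empty elements are removed as well. The helper name removeEmptyElements is hypothetical, and the pattern assumes no attribute value contains ">".

```javascript
// Hypothetical helper (not from the original answers): removes every element
// whose body is empty or whitespace-only, including nested ones.
function removeEmptyElements(html) {
  // \1 forces the closing tag to have the same name as the opening tag.
  const emptyPair = /<([a-z][a-z0-9]*)(?:\s[^>]*)?>\s*<\/\1>/gi;
  let previous;
  do {
    previous = html;
    html = html.replace(emptyPair, '');
  } while (html !== previous); // stop once a pass removes nothing
  return html;
}

console.log(removeEmptyElements('<div><p></p></div><p>Keep</p>'));
console.log(removeEmptyElements('<div><p>text</p></div>'));
```

To restrict this to paragraphs only, the capture group can be replaced by a literal p.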
Try this.
The empty <p> tags in your code contain a \u200b (zero-width space) character. The character is invisible, but it is there.
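If the "empty" paragraphs really do contain a zero-width space, a pattern such as /<p>\s*<\/p>/ will not match them, because \s in JavaScript does not include \u200b. The sketch below widens the definition of "empty" to cover zero-width characters and the &nbsp; entity. The helper name is hypothetical, and treating &nbsp; as empty is an assumption that may not suit every application.

```javascript
// Hypothetical helper (not from the original answers): removes <p> elements
// that contain only whitespace, &nbsp; entities, or zero-width characters.
function removeInvisiblyEmptyParagraphs(html) {
  // \s covers ordinary whitespace; \u200b-\u200d are the zero-width space,
  // non-joiner and joiner, which \s does not match.
  const blank = /<p(?:\s[^>]*)?>(?:\s|&nbsp;|[\u200b-\u200d])*<\/p>/gi;
  return html.replace(blank, '');
}

const sample = '<p>A</p><p>\u200b</p><p>&nbsp; </p><p>B</p>';
const cleaned = removeInvisiblyEmptyParagraphs(sample);
console.log(cleaned);
```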
You can use the split() and join('') methods:
var test = "<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>";
var str = test.split('<p></p>').join('');
console.log(str);
Or you can use the replace() method:
var test = "<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>";
var str = test.replace(/<p><\/p>/gi, '');
console.log(str);