preg_replace removes a captured group when it should reuse it in the replacement
Given the original string:
<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...
(let's assume it is the value of $str) and the following processing sequence:
$str=nl2br($str);
Now we have:
<p>my text 1</p><br />
some other content<br />
<p>some other paragraph followed by an html line break</p><br><br />
etc...<br />
... which is fine. Then:
$str=preg_replace('/(<\/p>)<br.{0,2}\/>/',, $str);
I expected this code to remove every HTML <br />, <br> or <br/> tag that immediately follows a closing </p>.
So why does PHP give me:
php > echo $str;
<p>my text 1
some other content<br />
<p>some other paragraph followed by an html line break</p><br><br />
etc...<br />
php >
?
I would rather expect:
<p>my text 1</p>
some other content<br />
<p>some other paragraph followed by an html line break</p><br>
etc...<br />
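The back-reference in the replacement string is malformed: write it in single quotes, as '$1' (mind the quotes!). Also, <br.{0,2}\/> does not cover <br>, because the pattern forces a slash. Taking all of the above into account, here is a solution:
$str = preg_replace('~(</p>)<br ?/?>~', '$1', $str);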
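To check both points at once, here is a minimal sketch that runs the question's pattern and the relaxed one against the three spellings of the tag. The sample string and the variable names are made up for illustration, and the replacement is assumed to be the single-quoted back-reference '$1' in both calls.

```php
<?php
// Hypothetical sample: one paragraph per <br> spelling, each followed by a newline.
$sample = "<p>a</p><br>\n<p>b</p><br/>\n<p>c</p><br />\n";

// Question's pattern: the trailing \/ is mandatory, so a bare <br> never matches.
$strict = preg_replace('/(<\/p>)<br.{0,2}\/>/', '$1', $sample);

// Relaxed pattern: optional space, optional slash; '$1' puts </p> back.
$loose = preg_replace('~(</p>)<br ?/?>~', '$1', $sample);

var_dump($strict === "<p>a</p><br>\n<p>b</p>\n<p>c</p>\n"); // bool(true)
var_dump($loose === "<p>a</p>\n<p>b</p>\n<p>c</p>\n");      // bool(true)
```

The first result still contains the bare <br>, which is the case the stricter pattern misses.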
This will do what you need:
<?php
$text = '<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...';
$text = nl2br($text);
$regex= '#<\/p>(<br\s?\/?>)#';
$text = preg_replace($regex, '</p>', $text);
echo $text;
See how the regex matches here: https://regex101.com/r/0gPhL3/1
Check the code here: https://3v4l.org/2RkFb
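As a variation on the same idea (not part of the answer above), a lookbehind assertion can check for the closing </p> without consuming it, so nothing has to be put back in the replacement. This is only a sketch. The input is a shortened version of the question's string, and $text and $clean are illustrative names.

```php
<?php
// nl2br() inserts "<br />" before every newline.
$text = nl2br("<p>my text 1</p>\nsome other content\n");

// (?<=</p>) asserts that </p> comes right before the tag without matching it,
// so the replacement can simply be the empty string.
$clean = preg_replace('~(?<=</p>)<br\s?/?>~', '', $text);

var_dump($clean === "<p>my text 1</p>\nsome other content<br />\n"); // bool(true)
```

Whether this reads better than repeating '</p>' in the replacement is a matter of taste. Both approaches remove only the tag that directly follows the paragraph end.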
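I think what you are saying is:
- you want to keep the pre-existing <br> tags, and
- add a <br> tag wherever there is a newline that is not preceded by an HTML tag (in your sample input specifically, </p>).
If that is the core of your intent, you can omit the nl2br() step (and the subsequent clean-up regex call) and target only the lines that end in text rather than in a tag.
*If this does not work for your real project, you will have to adjust it or explain how your sample data differs from your real data.
Code: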
$string = <<<HTML
<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...
HTML;
$string = preg_replace('~</?[a-z]+>\R(*SKIP)(*FAIL)|$~m', '<br>', $string);
var_export($string); // output
echo "\n----\n";
var_export(json_encode($string)); // encoded output (to show newline characters retained)
Output:
'<p>my text 1</p>
some other content<br>
<p>some other paragraph followed by an html line break</p><br>
etc...<br>'
----
'"<p>my text 1<\/p>\nsome other content<br>\n<p>some other paragraph followed by an html line break<\/p><br>\netc...<br>"'
Essentially, I think you can accomplish this task more directly. Here is the pattern breakdown:
~ #start of pattern delimiter
</?[a-z]+> #match less than symbol, optional forward slash, one or more letters, greater than symbol
\R #match newline character(s) ...you can add match one or more if suitable for your project
(*SKIP)(*FAIL) #discard the characters matched (disqualify the match / do not replace)
| #or
$ #the end of a line
~ #end of pattern delimiter
m #multiline pattern modifier, tells regex to treat $ as end of line not end of string
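To try the breakdown on other input, the call can be wrapped in a small function. This is a sketch only: the helper name addBreaksToTextLines and the sample string are hypothetical, and the pattern is the one broken down above, unchanged.

```php
<?php
// Hypothetical helper; appends <br> to every line that does not end in a tag.
function addBreaksToTextLines(string $html): string
{
    // Lines ending in a tag + newline are skipped via (*SKIP)(*FAIL);
    // every remaining end-of-line position ($ with the m modifier) gets a <br>.
    // Falls back to the unchanged input if preg_replace() reports an error (null).
    return preg_replace('~</?[a-z]+>\R(*SKIP)(*FAIL)|$~m', '<br>', $html) ?? $html;
}

$input = "intro line\n<p>para</p>\nclosing line";
echo addBreaksToTextLines($input), "\n";
// intro line<br>
// <p>para</p>
// closing line<br>
```

Note that </?[a-z]+> only recognizes bare lowercase tags. A line ending in something like </h1> or <p class="x"> would not be skipped and would get a <br> appended, so the character class may need widening for real data.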