preg_replace removes a group when it should reuse it in the replacement

Given the original string:

<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...

Let's assume this is the value of $str, and apply the following processing step:

$str=nl2br($str);

Now we have:

<p>my text 1</p><br />
some other content<br />
<p>some other paragraph followed by an html line break</p><br><br />
etc...<br />

..., which is fine. Then:

$str=preg_replace('/(<\/p>)<br.{0,2}\/>/',, $str);

I expect this code to remove every HTML <br />, <br> or <br/> tag that immediately follows a closing </p>.

How come PHP gives me:

php > echo $str;
<p>my text 1
some other content<br />
<p>some other paragraph followed by an html line break</p><br><br />
etc...<br />
php > 

?

I would rather expect:

<p>my text 1</p>
some other content<br />
<p>some other paragraph followed by an html line break</p><br>
etc...<br />

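The backreference in the replacement string is malformed: it has to be passed as a quoted string, '$1' (quotes!). Also, <br.{0,2}\/> does not cover <br>, because you force a slash. Considering all of the above, here is a solution:

$str = preg_replace('~(</p>)<br ?/?>~', '$1', $str);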

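To make the role of the quoted backreference concrete, here is a minimal sketch. It assumes $str already holds the nl2br() output shown in the question (built by hand here), and the variable names $fixed and $dropped are only illustrative:

```php
<?php
// Sample input from the question, as it looks after nl2br() has run.
$str = "<p>my text 1</p><br />\n"
     . "some other content<br />\n"
     . "<p>some other paragraph followed by an html line break</p><br><br />\n"
     . "etc...<br />";

// '$1' (single-quoted string) re-inserts capture group 1, the closing </p>.
// "<br ?/?>" makes the space and the slash optional: <br>, <br/>, <br />.
$fixed = preg_replace('~(</p>)<br ?/?>~', '$1', $str);

// With an empty replacement the whole match, including </p>, is deleted.
$dropped = preg_replace('~(</p>)<br ?/?>~', '', $str);

echo $fixed, "\n";
```

Note that on the third line only the first break after </p> is consumed, so one break tag still follows that paragraph.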

This will do what you need:

<?php

$text = '<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...';

$text = nl2br($text);

$regex= '#<\/p>(<br\s?\/?>)#';
$text = preg_replace($regex, '</p>', $text);
echo $text;

See how the regex matches here: https://regex101.com/r/0gPhL3/1

Check the code here: https://3v4l.org/2RkFb

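I think what you are saying is:

  1. You want to keep the pre-existing <br> tags, and
  2. add a <br> tag wherever there is a newline that is not preceded by an html tag (in your sample input specifically, </p>).

If this is the heart of your coding intent, then you can omit the nl2br() step (and the subsequent clean-up regex call) and only target lines that end in text rather than in a tag.

*If this does not work for your actual project, you will have to adjust it or explain how your sample data differs from your real data.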

Code:

$string = <<<HTML
<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...
HTML;

$string = preg_replace('~</?[a-z]+>\R(*SKIP)(*FAIL)|$~m', '<br>', $string);

var_export($string);                   // output
echo "\n----\n";
var_export(json_encode($string));      // encoded output (to show newline characters retained)

Output:

'<p>my text 1</p>
some other content<br>
<p>some other paragraph followed by an html line break</p><br>
etc...<br>'
----
'"<p>my text 1<\/p>\nsome other content<br>\n<p>some other paragraph followed by an html line break<\/p><br>\netc...<br>"'

Essentially, I think you can accomplish this task more directly. Here is the pattern breakdown:

~               #start of pattern delimiter
</?[a-z]+>      #match less than symbol, optional forward slash, one or more letters, greater than symbol
\R              #match newline character(s)  ...you can add match one or more if suitable for your project
(*SKIP)(*FAIL)  #discard the characters matched (disqualify the match / do not replace)
|               #or
$               #the end of a line
~               #end of pattern delimiter
m               #multiline pattern modifier, tells regex to treat $ as end of line not end of string
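
For comparison, the same idea can be sketched with a negative lookbehind instead of (*SKIP)(*FAIL). This is a swapped-in simplification, not the pattern above: it only checks whether the last character of the line is ">", and it assumes plain \n line endings. The variable names are illustrative:

```php
<?php
// Sample input from the question, before any nl2br() call.
$string = "<p>my text 1</p>\n"
        . "some other content\n"
        . "<p>some other paragraph followed by an html line break</p><br>\n"
        . "etc...";

// (?<!>) is a negative lookbehind: the line end must not be preceded by ">".
// The m modifier lets $ match at the end of every line, not just the string.
$result = preg_replace('~(?<!>)$~m', '<br>', $string);

echo $result, "\n";
```

On this sample it yields the same text as the output listed above, but lines that end in a literal ">" that is not part of a tag would be skipped as well.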