preg_replace does remove a group whereas it should use it inside the replacement

Question

Given the original string:

<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...

Let's assume this is the value of $str, and apply the following processing step:

$str=nl2br($str);

Now we have:

<p>my text 1</p><br />
some other content<br />
<p>some other paragraph followed by an html line break</p><br><br />
etc...<br />

...which is fine. Then:

$str=preg_replace('/(<\/p>)<br.{0,2}\/>/',, $str);

I expected this code to remove every HTML <br>, <br/> or <br /> tag that immediately follows a closing </p>.

What PHP gave me instead:

php > echo $str;
<p>my text 1
some other content<br />
<p>some other paragraph followed by an html line break</p><br><br />
etc...<br />
php >

?

I would have preferred:

<p>my text 1</p>
some other content<br />
<p>some other paragraph followed by an html line break</p><br>
etc...<br />

Answer 1

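The backreference in the replacement string is malformed: it has to be passed as a quoted string such as '$1' (quoted!). Also, <br.{0,2}\/> does not cover <br>, because it forces a slash. Taking all of the above into account, here is a solution:

$str = preg_replace('~(</p>)<br ?/?>~', '$1', $str);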
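To make the effect of the quoted backreference concrete, here is a minimal sketch that runs the pattern above twice on the question's sample input: once with '$1' and once with an empty replacement. The empty string stands in for an unresolved reference, on the assumption (suggested by the output in the question) that the original replacement argument evaluated to nothing. The variable names $kept and $dropped are made up for this illustration.

```php
<?php
// Sample input from the question, with nl2br() applied as in the original sequence.
$str = nl2br("<p>my text 1</p>\nsome other content\n<p>some other paragraph followed by an html line break</p><br>\netc...\n");

// Quoted '$1': preg_replace() substitutes the text captured by group 1, so "</p>" survives.
$kept = preg_replace('~(</p>)<br ?/?>~', '$1', $str);

// Empty replacement: the whole match is dropped, including the "</p>" inside it.
$dropped = preg_replace('~(</p>)<br ?/?>~', '', $str);

echo $kept, "\n----\n", $dropped, "\n";
```

Note that the pattern removes only one line break tag after each </p>, so a line that ends in two consecutive tags keeps the second one.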


Answer 2

This will do what you need:

<?php

$text = '<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...';

$text = nl2br($text);

$regex= '#<\/p>(<br\s?\/?>)#';
$text = preg_replace($regex, '</p>', $text);
echo $text;

See how the regex matches here: https://regex101.com/r/0gPhL3/1

Check the code here: https://3v4l.org/2RkFb
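As a quick sanity check, the sketch below feeds the same pattern the three line break spellings mentioned in the question. The $variants and $results arrays are hypothetical helpers added for this illustration only.

```php
<?php
// Same pattern as above: "</p>" followed by <br>, <br/> or <br />.
$regex = '#<\/p>(<br\s?\/?>)#';

// The three spellings of the line break tag named in the question.
$variants = ['</p><br>', '</p><br/>', '</p><br />'];

$results = [];
foreach ($variants as $variant) {
    // Each variant should collapse to a bare closing paragraph tag.
    $results[$variant] = preg_replace($regex, '</p>', $variant);
    echo $variant, ' => ', $results[$variant], "\n";
}
```

Because the replacement is the literal string '</p>', the capture group around the line break tag is not needed for the result; it only makes the matched tag visible in a regex tester.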

Answer 3

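I think what you are saying is:

You want to keep the pre-existing <br> tags and
add a <br> tag wherever there is a newline that is not preceded by an html tag (going specifically by your sample input).

If that is the heart of what your code is meant to do, then you can omit the nl2br() step (and the cleanup regex call that follows it) and target only the lines that end in text rather than in a tag.

*If this does not work for your actual project, you will have to adjust it, or explain how your sample data differs from your real data.

Code: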

$string = <<<HTML
<p>my text 1</p>
some other content
<p>some other paragraph followed by an html line break</p><br>
etc...
HTML;

$string = preg_replace('~</?[a-z]+>\R(*SKIP)(*FAIL)|$~m', '<br>', $string);

var_export($string);                   // output
echo "\n----\n";
var_export(json_encode($string));      // encoded output (to show newline characters retained)

Output:

'<p>my text 1</p>
some other content<br>
<p>some other paragraph followed by an html line break</p><br>
etc...<br>'
----
'"<p>my text 1<\/p>\nsome other content<br>\n<p>some other paragraph followed by an html line break<\/p><br>\netc...<br>"'

Essentially, I think you can accomplish this task more directly. Here is the pattern breakdown:

~               #start of pattern delimiter
</?[a-z]+>      #match less than symbol, optional forward slash, one or more letters, greater than symbol
\R              #match newline character(s)  ...you can add match one or more if suitable for your project
(*SKIP)(*FAIL)  #discard the characters matched (disqualify the match / do not replace)
|               #or
$               #the end of a line
~               #end of pattern delimiter
m               #multiline pattern modifier, tells regex to treat $ as end of line not end of string
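The breakdown above can be tried on a smaller input. This is only a sketch: the two-line string in $demo is made up for illustration, with a first line that ends in a tag and a second line that ends in plain text.

```php
<?php
// Hypothetical input: line 1 ends in a tag, line 2 ends in plain text.
$demo = "<p>tagged</p>\nplain text";

// A tag followed by a newline is matched and then discarded by (*SKIP)(*FAIL),
// so the end of that line is never offered to the "$" alternative.
// Every other end of line matches "$" and receives a "<br>".
$result = preg_replace('~</?[a-z]+>\R(*SKIP)(*FAIL)|$~m', '<br>', $demo);

var_export($result);
```

Only the second line gets a <br>, because (*SKIP) moves the next match attempt past the newline that follows </p>.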


php

regex

preg-replace