PHP preg_replace: HTML tag containing line breaks

Question

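I'm trying to remove a certain HTML tag with preg_replace, but I can't find a way to do it. It works if I strip the line breaks from the string, but not when they are present.

The regex so far: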

preg_replace("/<ol class=\"comment-list\">.*?<\/ol>/", "", $string);

The string in question:

<ol class="comment-list">
<time datetime="2016-03-25T15:27:34+00:00"></ol>

I'm using http://www.phpliveregex.com/ for testing.

Thanks a lot for any help!

Answer 1

I know this answer may not be what you're looking for, but if you want to give it a try, this is how you can remove the <ol> node with DOMDocument:

$dom = new DOMDocument();           // Init DOMDocument object
libxml_use_internal_errors( True ); // Disable libxml errors
$dom->loadHTML( $html );            // Load HTML
$xpath = new DOMXPath( $dom );      // Init DOMXPath (useful for complex queries)

/* Search for all <ol> nodes with class "comment-list": */
$nodes = $xpath->query( '//ol[@class="comment-list"]' );
/* Remove nodes (iterate over a copy: the XPath result list does not shrink as nodes are removed): */
foreach( iterator_to_array( $nodes ) as $node )
{
    $node->parentNode->removeChild( $node );
}

/* Output modified HTML: */
echo $dom->saveHTML();

Yes, it's 7 lines versus 1, but I suggest you do it this way. Regular expressions are a great invention, but they are not meant for HTML/XML.
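
For reference, here is a minimal sketch of the same DOM approach wrapped in a function and run against a string like the one in the question. The function name removeOlByClass, the surrounding <p> elements, and the token-aware class test are assumptions added for illustration, not part of the original code. It also assumes $class is a plain class name with no quotes in it.

```php
<?php
// Hypothetical helper (illustration only): removes every <ol> whose class
// attribute contains $class as one of its space-separated tokens.
function removeOlByClass(string $html, string $class): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally (e.g. for HTML5 tags such as <time>).
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Token-aware test, so class="comment-list foo" matches as well.
    $query = sprintf(
        '//ol[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    // Copy the result list to an array before removing nodes from the tree.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        $node->parentNode->removeChild($node);
    }

    return $dom->saveHTML();
}

$string = "<p>Before</p>\n<ol class=\"comment-list\">\n"
        . "<time datetime=\"2016-03-25T15:27:34+00:00\"></ol>\n<p>After</p>";

$result = removeOlByClass($string, 'comment-list');
echo $result;
```

Note that loadHTML() wraps a fragment in a doctype and <html><body> elements, so the output is a full document rather than the bare fragment.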

Read more about DOMDocument
Read more about DOMXPath
Read why you can't parse [X]HTML with regular expressions

Answer 2

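As I said in my short comment on this page, @HamZa's comment is actually the only useful piece of information here: add the s modifier to your regex so that the dot also matches line breaks.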

preg_replace("/<ol class=\"comment-list\">.*?<\/ol>/s", "", $string);

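It is advisable not to parse (X)HTML with regular expressions. But the question here is very simple: it only asks how to match line breaks with preg_replace. This is how you do it.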
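
To see the difference, here is a small sketch that runs both patterns against the string from the question. The variable names $withoutS and $withS are made up for this example.

```php
<?php
// The string from the question, with a newline inside the <ol> element.
$string = "<ol class=\"comment-list\">\n"
        . "<time datetime=\"2016-03-25T15:27:34+00:00\"></ol>";

// Without the s modifier the dot does not match "\n", so the pattern
// cannot reach </ol> and the string comes back unchanged.
$withoutS = preg_replace('/<ol class="comment-list">.*?<\/ol>/', '', $string);

// With the s modifier (PCRE_DOTALL) the dot matches newlines too.
$withS = preg_replace('/<ol class="comment-list">.*?<\/ol>/s', '', $string);

var_dump($withoutS === $string); // bool(true)
var_dump($withS === '');         // bool(true)
```

Keep in mind that the lazy .*? stops at the first </ol>, so an <ol> nested inside the comment list would end the match early.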

php

regex

preg-replace