PHP regex to remove everything inside a tag
I have a string that contains anchor tags. These anchor tags contain some HTML and text, like this:
<a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
<img id="my_main_pic" class="content-title-main-pic" src="https://example.com/xyz.jpg" width="30px" height="30px" alt="Main Profile Picture">
My HTML Link
<label>Click here to view
<cite class="glyphicon glyphicon-new-window" title="Blog"></cite>
</label>
</a>
My string looks like this:
<p>Hello there,</p>
<p><a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
<img id="my_main_pic" class="content-title-main-pic" src="https://example.com/xyz.jpg" width="30px" height="30px" alt="Main Profile Picture">
My HTML Link
<label>Click here to view
<cite class="glyphicon glyphicon-new-window" title="Blog"></cite>
</label>
</a>
what's up.
</p>
<p>
Click here <a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
<img id="my_main_pic" class="content-title-main-pic" src="https://example.com/xyz.jpg" width="30px" height="30px" alt="Main Profile Picture">
My HTML Link
<label>Click here to view
<cite class="glyphicon glyphicon-new-window" title="Blog"></cite>
</label>
</a> to view my pic.
</p>
I need to replace each anchor tag in the string with its href, so that the string becomes:
<p>Hello there,</p>
<p>https://example.com/my-blog
what's up.
</p>
<p>
Click here https://example.com/my-blog to view my pic.
</p>
I tried the code below, but it does not replace the tags with their href:
$dom = new DomDocument();
$dom->loadHTML( $text );
$matches = array();
foreach ( $dom->getElementsByTagName('a') as $item ) {
    $matches[] = array (
        'a_tag' => $dom->saveHTML($item),
        'href' => $item->getAttribute('href'),
        'anchor_text' => $item->nodeValue
    );
}
foreach( $matches as $match )
{
    // Replace a tag by its href
    $text = str_replace( $match['a_tag'], $match['href'], $text );
}
return $text;
Does anyone know how to do this?
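We can try using a regular expression for this. Match the following pattern and replace it with its capture group:
<a.*?href="([^"]*)".*?>.*?<\/a>
With preg_replace, we can match this pattern repeatedly and replace each anchor tag with the href URL captured inside the tag.
$result = preg_replace('/<a.*?href="([^"]*)".*?>.*?<\/a>/s', '$1', $string);
Note the s flag at the end of /pattern/s. It performs the replacement in DOTALL mode, which means the dot also matches newlines (that is, a match can span lines, which is what you want here).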
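To see the replacement end to end, here is a minimal sketch that runs the pattern over a shortened version of the question's markup. The sample string is an assumption made for illustration (it is trimmed from the original), and the pattern relies on every href being double-quoted and on anchors not being nested; markup outside those assumptions may need a real HTML parser instead.

```php
<?php
// Hypothetical sample input, shortened from the markup in the question.
$string = <<<HTML
<p>Hello there,</p>
<p><a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
<img src="https://example.com/xyz.jpg" alt="Main Profile Picture">
My HTML Link
</a>
what's up.
</p>
<p>
Click here <a class="content-title-link" href="https://example.com/my-blog" target="_blank">
My HTML Link
</a> to view my pic.
</p>
HTML;

// Replace each whole <a ...>...</a> element with its captured href ($1).
// The s flag lets the dot match newlines, so multi-line anchors are matched.
$result = preg_replace('/<a.*?href="([^"]*)".*?>.*?<\/a>/s', '$1', $string);

echo $result, PHP_EOL;
```

With this input, both anchors should collapse to the bare URL while the surrounding paragraph text stays in place.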
Search for this regex:
<a.*?href="([^"]*)"[^>]*>
and replace it with