Get content of [embed] tag in PHP

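I'm migrating post content from WordPress to my own CMS. Some WordPress plugins added the [embed] shortcode, and now I need to get the content of each [embed]...[/embed] pair so I can convert it to my own embed structure. Some posts have one embed tag, some have several, and some have none.

I tried the solutions from the following questions: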

PHP/regex: How to get the string value of HTML tag?

get everything between <tag> and </tag> with php

Php get string between tags

PHP Regex find text between custom added HTML Tags

But none of them work for me: preg_match_all returns either an empty array or an array with the wrong content, not what is inside the embed tags.

preg_match_all('/[^embed](.*)[^\/embed]/', $content, $embeds);
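A likely reason the pattern above fails is that `[^embed]` is a negated character class (any single character other than e, m, b, or d), not the literal text `[embed]`. Below is a minimal sketch of a corrected pattern, assuming the tags carry no attributes and are not nested; the `$content` sample is hypothetical and only stands in for a real post body.

```php
<?php
// Hypothetical sample standing in for a WordPress post body.
$content = "<p>Intro [embed]</p>\n"
    . "<div class=\"tiny-pageembed\"><iframe src=\"https://example.com/a\"></iframe></div>\n"
    . "<p>[/embed] Outro [embed]https://example.com/b[/embed]</p>";

// \[ and \] match literal brackets instead of opening a character class.
// (.*?) is lazy, so each [embed]...[/embed] pair is captured separately.
// The s modifier lets . match newlines; i ignores the case of the tag name.
preg_match_all('~\[embed\](.*?)\[/embed\]~si', $content, $embeds);

// $embeds[1] holds the inner content of every pair; it is empty when a post has no tags.
foreach ($embeds[1] as $inner) {
    echo trim($inner), PHP_EOL;
}
```

Using `~` as the delimiter avoids having to escape the slash in `[/embed]`.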

Sample content from WP:

.....
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed faucibus nisi at lacus dignissim vehicula. Phasellus tellus lorem, mattis et porttitor non, vehicula et nisi. Interdum et malesuada fames ac ante ipsum primis in faucibus. Phasellus a aliquet ligula. Aenean malesuada ligula urna, ut vehicula nisi dapibus a. Phasellus fringilla turpis blandit tellus scelerisque posuere. Mauris at dui nisi. Nam at viverra lectus, vel interdum velit. Nullam a risus hendrerit arcu egestas hendrerit. Morbi ut faucibus metus, eu malesuada ipsum. Integer dapibus mollis molestie. [embed]</p>
<div class="tiny-pageembed">
    <iframe src="https://twitter.com/chainlink/status/xxxxxx" width="350px" height="260px" frameborder="0" scrolling="no"></iframe></div>

<p>[/embed] Ut magna sem, consectetur et aliquam vitae, mattis id tellus. Curabitur in risus sed neque condimentum congue ac sit amet ante. Cras eget rutrum justo, at pretium libero. Duis consectetur enim in nisl molestie commodo facilisis nec orci. Praesent vitae ullamcorper arcu. Phasellus aliquet, metus in pulvinar sodales, lorem eros convallis quam, nec pulvinar turpis dolor vel elit. Fusce ornare erat blandit fringilla pellentesque.</p>
.....

WordPress has a function that returns the regular expression it uses to parse shortcodes: https://developer.wordpress.org/reference/functions/get_shortcode_regex/

The source code is here: https://core.trac.wordpress.org/browser/tags/5.2/src/wp-includes/shortcodes.php#L207
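Outside WordPress, `get_shortcode_regex()` is not available, so a custom CMS needs its own pattern. The sketch below is a simplified stand-in, not WordPress's actual regex: it assumes non-nested `[embed]` pairs with an optional attribute string, and the helper name `convert_embeds` and the `<figure class="embed">` output are hypothetical.

```php
<?php
// Hypothetical helper: rewrites every [embed ...]...[/embed] pair via a callback.
function convert_embeds(string $content, callable $render): string
{
    // Group 1: optional attribute string after the tag name.
    // Group 2: inner content (lazy, may span newlines thanks to the s modifier).
    // \b keeps the pattern from matching longer names such as [embedded].
    $pattern = '~\[embed\b([^\]]*)\](.*?)\[/embed\]~si';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($render): string {
            return $render(trim($m[2]), trim($m[1]));
        },
        $content
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $content;
}

$html = convert_embeds(
    'Before [embed width="350"]https://example.com/x[/embed] after',
    function (string $inner, string $attrs): string {
        // Assumed target structure; replace with whatever the CMS expects.
        return '<figure class="embed">' . $inner . '</figure>';
    }
);
echo $html, PHP_EOL;
```

Posts without any `[embed]` tag pass through unchanged, which covers the "some have none" case from the question.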