PHP \DOMDocument converts > to &gt; and & to &amp;

I pass HTML to \DOMDocument, and \DOMDocument converts all special characters.

How can I tell \DOMDocument not to convert the special characters between {% ..... %}?

{% if &a > 10 %} is converted to {% if &amp;a &gt; 10 %}

Input

<!DOCTYPE html>
<body>
    {% if &a > 10 %}
        {% print &a %}
    {% end if %}
<img src="{%# image %}" >
<script>
    if a > 10
</script>
</body>

Output

<!DOCTYPE html>
<html><body>
    {% if &amp;a &gt; 10 %}
        {% print &amp;a %}
    {% end if %}
<img src="%7B%# image %%7D" >
<script>
    if a > 10
</script></body></html>

Code

$dom = new \DOMDocument('1.0', 'UTF-8');
$content = '<!DOCTYPE html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}" >
                <script>
                    if a > 10
                </script>
            </body>';
@$dom->loadHTML($content);
echo $dom->saveHTML();

Attempt using htmlspecialchars:

$dom = new DOMDocument('1.0', 'UTF-8');
$content =  htmlspecialchars('<!DOCTYPE html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}" >
                <script>
                    if a > 10
                </script>
            </body>');
$dom->loadHTML($content);
echo $dom->saveHTML();

Output:

<!DOCTYPE html><body> {% if &a > 10 %} {% print &a %} {% end if %} <img > src="{%# image %}" > <script> if a > 10 </script> </body>

Before sending the HTML to DOMDocument, encode the special data (the {% ... %} blocks); once DOMDocument has finished, decode it again.

Encoding code

<?php
$dom = new DomDocument();
$content = '<!DOCTYPE html>
<html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}"><script>
                    if a > 10
                </script></body></html>';

// Markers wrapped around each base64-encoded template tag.
$tag_start = '(base64';
$tag_end   = ')';

// Encode every {% ... %} block so DOMDocument has nothing to escape.
// The lazy match stops at the first closing %}, even if the tag contains a }.
$pattern = '/\{%.*?%\}/s';
preg_match_all($pattern, $content, $matches);
foreach (array_unique($matches[0]) as $val) {
    $base64 = $tag_start . base64_encode($val) . $tag_end;
    $content = str_replace($val, $base64, $content);
}

// echo $content;

$dom->loadHTML($content);
$domContent = $dom->saveHTML();

Output

<!DOCTYPE html>
<html><body>
                (base64eyUgaWYgJmEgPiAxMCAlfQ==)
                    (base64eyUgcHJpbnQgJmEgJX0=)
                (base64eyUgZW5kIGlmICV9)
            <img src="(base64eyUjIGltYWdlICV9)"><script>
                if a > 10
            </script></body></html>
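
The code above stops after encoding, so the output still contains the (base64...) markers. The decoding step that the answer describes could look like the minimal sketch below. It assumes the same `(base64` / `)` markers. The helper names `encodeTemplateTags` and `decodeTemplateTags` and both regular expressions are hypothetical and not part of the original code.

```php
<?php
// Hypothetical helpers: names and patterns are assumptions for illustration only.

// Replace each {% ... %} block with (base64...) so DOMDocument has nothing to escape.
function encodeTemplateTags(string $html): string
{
    return preg_replace_callback(
        '/\{%.*?%\}/s',
        function (array $m): string {
            return '(base64' . base64_encode($m[0]) . ')';
        },
        $html
    );
}

// Turn every (base64...) marker back into the original template tag.
function decodeTemplateTags(string $html): string
{
    return preg_replace_callback(
        '/\(base64([A-Za-z0-9+\/=]+)\)/',
        function (array $m): string {
            return base64_decode($m[1]);
        },
        $html
    );
}

$content = '<!DOCTYPE html><html><body>'
    . '{% if &a > 10 %}<img src="{%# image %}">{% end if %}'
    . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML(encodeTemplateTags($content));

// Decode only after DOMDocument has serialized the document.
$result = decodeTemplateTags($dom->saveHTML());

echo $result;
```

One caveat with this approach: any literal text in the page that happens to look like `(base64...)` would be decoded too, so a less common marker may be safer in real templates.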