PHP preg_match: ignore content within certain elements

Question

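I'm writing a regex to filter content and fix up its typography. So far my code seems to filter the content correctly with preg_replace, but I can't figure out how to skip content that sits inside certain tags, for example <pre>.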

For reference, this will be used in WordPress's the_content filter, so my current code looks like this:

function my_typography( $str ) {
    $ignore_elements = array("code", "pre");

    $rules = array(
        "?" => array("before"=> "&thinsp;", "after"=>""),
        // the others are stripped out for simplicity
    );

    foreach($rules as $rule=>$params) {
        // Pseudo :
        //    if( !in_array( $parent_tag, $ignore_elements) {
        // /Pseudo


        $formatted = $params['before'] . $rule . $params['after'];
        $str = preg_replace( $rule, $formatted, $str );


        // Pseudo :
        //    }
        // /Pseudo
    }

    return $str;
}
add_filter( 'the_content',  'my_typography' );

Basically:

<p>Was this filtered? I hope so</p>
<pre>Was this filtered? I hope not.</pre>

should become

<p>Was this filtered&thinsp;? I hope so</p>
<pre>Was this filtered? I hope not.</pre>

Answer 1

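In preg_replace you need to wrap the search pattern in regex delimiters, and you must call preg_quote to escape all special regex characters such as ?, ., *, + and so on. Passing the delimiter as the second argument to preg_quote escapes it as well: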

$str = preg_replace( '~' . preg_quote($rule, '~') . '~', $formatted, $str );
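To make the effect of the quoting concrete, here is a minimal sketch. The helper name build_literal_pattern is hypothetical and only used for illustration; it wraps the same preg_quote call shown above.

```php
<?php
// Hypothetical helper (not from the original answer): turns a literal
// rule string such as "?" into a delimited, fully escaped PCRE pattern.
function build_literal_pattern( $rule, $delimiter = '~' ) {
    // preg_quote() escapes regex metacharacters; the second argument
    // makes it escape the chosen delimiter too.
    return $delimiter . preg_quote( (string) $rule, $delimiter ) . $delimiter;
}

// Without quoting, "?" on its own is not a valid pattern.
// With quoting, it matches a literal question mark.
echo preg_replace( build_literal_pattern( '?' ), '&thinsp;?', 'Was this filtered?' ), PHP_EOL;
```

The same helper also works for rules that contain the delimiter itself, because preg_quote escapes it when it is passed as the second argument.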

Full code:

function my_typography( $str ) {
    $ignore_elements = array("code", "pre");

    $rules = array(
        "?" => array("before"=> "&thinsp;", "after"=>""),
        // the others are stripped out for simplicity
    );

    foreach($rules as $rule=>$params) {
        // Pseudo :
        //    if( !in_array( $parent_tag, $ignore_elements) {
        // /Pseudo


        $formatted = $params['before'] . $rule . $params['after'];
        $str = preg_replace( '~' . preg_quote($rule, '~') . '~', $formatted, $str );


        // Pseudo :
        //    }
        // /Pseudo
    }

    return $str;
}

Output:

<p>Was this filtered&thinsp;? I hope so</p>
<pre>Was this filtered&thinsp;? I hope not.</pre>
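Note that the output above still changes the text inside <pre>, because the code only fixes the pattern and leaves the pseudo-check unimplemented. One possible way to fill it in, as a rough sketch rather than a definitive solution, is to split the content on the ignored elements and apply the rules only to the pieces in between. The function name my_typography_skip_tags and its parameters are assumptions made for this example. It is regex-based, so it does not handle every HTML edge case (for instance, <code> nested inside <pre> ends the skipped block at the first closing tag).

```php
<?php
// Hypothetical variant of my_typography(); a sketch, not the original answer.
function my_typography_skip_tags( $str, array $rules, array $ignore_elements ) {
    // Build an alternation such as "code|pre" from the ignored tag names.
    $tags = implode( '|', array_map( function ( $tag ) {
        return preg_quote( $tag, '~' );
    }, $ignore_elements ) );

    // Split the content; the single capture group plus
    // PREG_SPLIT_DELIM_CAPTURE keeps each ignored element as its own chunk.
    $chunks = preg_split(
        '~(<(?:' . $tags . ')\b[^>]*>.*?</(?:' . $tags . ')>)~is',
        $str,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    foreach ( $chunks as $i => $chunk ) {
        if ( $i % 2 === 1 ) {
            continue; // odd indexes hold the captured ignored elements
        }
        foreach ( $rules as $rule => $params ) {
            $formatted = $params['before'] . $rule . $params['after'];
            $chunk = preg_replace( '~' . preg_quote( (string) $rule, '~' ) . '~', $formatted, $chunk );
        }
        $chunks[ $i ] = $chunk;
    }

    return implode( '', $chunks );
}

$rules = array( '?' => array( 'before' => '&thinsp;', 'after' => '' ) );
$html  = "<p>Was this filtered? I hope so</p>\n<pre>Was this filtered? I hope not.</pre>";
echo my_typography_skip_tags( $html, $rules, array( 'code', 'pre' ) );
```

With the sample input from the question, the <p> text gets the &thinsp; and the <pre> block is left as it was. For content with more complex markup, a DOM parser would likely be more robust than splitting with a regex.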
