Get matches between two delimiters when potentially nested

The delimiters in my case are opening and closing parentheses. When there is no nesting, I can get the text between them as follows:

$input = 'sometext(moretext)andmoretext(somemoretext)andevenmoretext(andmore)';
preg_match_all('#\((.*?)\)#', $input, $match);
echo('<pre>'.print_r($match[1],1).'</pre>');

Array
(
    [0] => moretext
    [1] => somemoretext
    [2] => andmore
)

However, when the parentheses are nested, I run into trouble and get the following:

$input = 'sometext(moretext)andmoretext(somemore(with(bitof(littletext)text)more(andmore)text)text)andevenmoretext(andmore)';
preg_match_all('#\((.*?)\)#', $input, $match);
echo('<pre>'.print_r($match[1],1).'</pre>');

Array
(
    [0] => moretext
    [1] => somemore(with(bitof(littletext
    [2] => andmore
    [3] => andmore
)

How can I return the entire string between each pair of top-level delimiters, like this:

Array
(
    [0] => moretext
    [1] => somemore(with(bitof(littletext)text)more(andmore)text)text
    [2] => andmore
)

PS. Eventually I will use recursive PHP to perform the same task on any top-level match that itself contains parentheses.

You can use this recursive regex pattern to match (...):

preg_match_all('/\( ( (?: [^()]* | (?R) )* ) \)/x', $input, $m);
print_r($m[1]);


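(?R) recurses the entire pattern, so each nested (...) is consumed as one complete unit and capture group 1 holds the contents of the outermost pair.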
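
To make the recursion easier to follow, here is a sketch of the same pattern spread over several lines with comments (the x flag allows this). One change is my own and not part of the answer above: `[^()]*` is written as the possessive `[^()]++`, a common way to limit backtracking on unbalanced input. The short sample input is only for illustration.

```php
<?php
// Same idea as the one-line pattern, annotated. The x flag lets the pattern
// contain whitespace and # comments.
$pattern = '/
    \(                # literal opening parenthesis
    (                 # group 1: everything inside the outer pair
      (?:
          [^()]++     # a run of non-parenthesis characters, possessive
        | (?R)        # or a complete nested pair, matched by the whole pattern again
      )*
    )
    \)                # the matching closing parenthesis
/x';

$input = 'a(b(c)d)e(f)';
preg_match_all($pattern, $input, $m);
print_r($m[1]); // the two top-level groups: b(c)d and f
```

Because the nested pair is swallowed by `(?R)` inside the outer match, the inner `(c)` does not show up as a separate result.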

Output:

Array
(
    [0] => moretext
    [1] => somemore(with(bitof(littletext)text)more(andmore)text)text
    [2] => andmore
)
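
For the follow-up in the PS (running the same extraction on any top-level match that contains parentheses), one possible approach is to wrap the pattern in a recursive PHP function. This is only a sketch: `nestedGroups` is a hypothetical name, the flat outermost-first ordering is my choice, and the possessive `[^()]++` is a substitution for the `[^()]*` used above.

```php
<?php
// Hypothetical helper: returns the contents of every parenthesized group at
// every nesting level, each outer group followed by the groups inside it.
function nestedGroups(string $input): array
{
    $found = [];
    if (preg_match_all('/\( ( (?: [^()]++ | (?R) )* ) \)/x', $input, $m)) {
        foreach ($m[1] as $inner) {
            $found[] = $inner;
            // Re-run the same extraction on the captured text to reach
            // the deeper levels.
            foreach (nestedGroups($inner) as $deeper) {
                $found[] = $deeper;
            }
        }
    }
    return $found;
}

print_r(nestedGroups('a(b(c)d)e(f)')); // b(c)d, then c, then f
```

If a tree is more useful than a flat list, the inner call's result could be stored under its parent instead of being appended.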

Here is a non-regex solution that does the same thing.

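function delimeterSplit( $input )
{
    $str = '';          // text collected for the current top-level group
    $output = array();

    $op = 0;            // opening parentheses seen in the current group
    $cp = 0;            // closing parentheses seen in the current group

    foreach( str_split( $input ) as $v )
    {
        if( $v === '(' )
        {
            ++$op;
        }
        if( $v === ')' )
        {
            ++$cp;
        }
        // Collect every character inside the group except the outermost pair.
        if( ( ( $op === 1 && $v !== '(' ) || $op > 1 ) && $op !== $cp )
        {
            $str .= $v;
        }
        // The counts are equal again: the top-level group is closed, so emit it.
        if( $op > 0 && $op === $cp )
        {
            $op = 0;
            $cp = 0;
            $output[] = $str;
            $str = '';
        }
    }

    return $output;
}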

echo '<pre>'.print_r( delimeterSplit( 'sometext(moretext)andmoretext(somemoretext)andevenmoretext(andmore)' ), true ).'</pre>';

echo '<pre>'.print_r( delimeterSplit( 'sometext(moretext)andmoretext(somemore(with(bitof(littletext)text)more(andmore)text)text)andevenmoretext(andmore)' ), true ).'</pre>';

Output:

Array
(
    [0] => moretext
    [1] => somemoretext
    [2] => andmore
)

Array
(
    [0] => moretext
    [1] => somemore(with(bitof(littletext)text)more(andmore)text)text
    [2] => andmore
)
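
The same scan can also be written with a single depth counter instead of separate open and close counts. The sketch below is a variant, not the code above: `topLevelGroups` and its `$open`/`$close` parameters are hypothetical names, it assumes single-byte delimiters, and it silently ignores a stray closing delimiter that appears outside any group.

```php
<?php
// Hypothetical alternative to delimeterSplit(): track nesting depth directly.
function topLevelGroups(string $input, string $open = '(', string $close = ')'): array
{
    $output = [];
    $current = '';
    $depth = 0;

    foreach (str_split($input) as $char) {
        if ($char === $open) {
            // Keep the opener only when it is nested inside a group.
            if ($depth > 0) {
                $current .= $char;
            }
            $depth++;
        } elseif ($char === $close) {
            if ($depth === 0) {
                continue; // stray closer outside any group: ignore it
            }
            $depth--;
            if ($depth === 0) {
                // The outermost group just closed: emit it.
                $output[] = $current;
                $current = '';
            } else {
                $current .= $char;
            }
        } elseif ($depth > 0) {
            $current .= $char;
        }
    }

    return $output;
}

print_r(topLevelGroups('sometext(moretext)andmoretext(some(nested)text)end'));
```

A group that is still open when the input ends is dropped rather than returned, which may or may not be what you want for malformed input.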