Regular expression doesn't work as expected: '/=(\w+\s*)+=/'

Here is my code:

<?php

preg_match_all('/=(\w+\s*)+=/', 'aaa =bbb ccc ddd eee= zzz', $match);
print_r($match);

It only captures eee:

Array
(
    [0] => Array
        (
            [0] => =bbb ccc ddd eee=
        )

    [1] => Array
        (
            [0] => eee
        )

)

I need it to capture bbb, ccc, ddd, and eee, like this:

...
   [1] => Array
        (
            [0] => bbb
            [1] => ccc
            [2] => ddd
            [3] => eee
        )
...

What is the problem?

Try this regex:

(\w+)(?=[^=]*=[^=]*$)

Explanation:

(\w+)          # capture each word
(?=            # only if it is followed by:
    [^=]*      # any number of non-'=' characters
    =          # one '=' character
    [^=]*$     # then only non-'=' characters up to the end of the string; this excludes the words before the first '='. Try removing it on regex101 to see what happens.
)

Regex live here.
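
For reference, here is a minimal PHP sketch of the lookahead pattern above applied to the string from the question. It assumes the subject contains exactly one =...= pair; with more = characters, the "exactly one = left before the end" condition would only hold for words in the last pair.

```php
<?php
// Sample input taken from the question.
$subject = 'aaa =bbb ccc ddd eee= zzz';

// A word is kept only when exactly one '=' remains between it and the end of the string.
preg_match_all('/(\w+)(?=[^=]*=[^=]*$)/', $subject, $match);

// $match[1] holds the captured words: bbb, ccc, ddd, eee
print_r($match[1]);
```

Because every word is its own match, preg_match_all collects them all in $match[1] instead of overwriting a single capture.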

Hope it helps.

Your regex starts and ends with =, so the only possible match is:

=bbb ccc ddd eee=

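This is expected behavior. A capture group's value is overwritten each time the group repeats, so only the last repetition (eee) is kept.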

1 group, 1 capture
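
A small sketch of that overwriting behavior, first on a toy pattern and then on the pattern from the question (the variable names $m and $q are only illustrative):

```php
<?php
// Group 1 repeats three times (a, b, c) but stores only the last repetition.
preg_match('/(\w)+/', 'abc', $m);
print_r($m);      // [0] => abc, [1] => c

// Same effect with the regex from the question: the group ends up holding "eee".
preg_match('/=(\w+\s*)+=/', 'aaa =bbb ccc ddd eee= zzz', $q);
print_r($q[1]);   // eee
```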

Instead of trying to get them all in one match attempt, match one token per attempt. Use \G to match at the end of the last match.

Something like this should work:

/(?(?=^)[^=]*+=(?=.*=)|\G\s+)([^\s=]+)/

regex101 Demo


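Regex breakdown

  • (?(?=^) ... | ... ) IF at the start of the string
    • [^=]*+= consume everything up to and including the first =
    • (?=.*=) and check that a closing = follows
  • ELSE
    • \G\s+ match only if the previous match ended here, consuming the leading whitespace
  • ([^\s=]+) match 1 token, captured in group 1.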
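
The breakdown above can be tried in PHP roughly like this. It is a sketch using the string from the question; $pattern and $subject are just illustrative names.

```php
<?php
$subject = 'aaa =bbb ccc ddd eee= zzz';

// First match: skip to the first '=' and capture the token after it.
// Later matches: continue from where the previous match ended (\G).
$pattern = '/(?(?=^)[^=]*+=(?=.*=)|\G\s+)([^\s=]+)/';

preg_match_all($pattern, $subject, $match);

// Full matches include the consumed prefix; group 1 holds the bare tokens.
print_r($match[0]);   // "aaa =bbb", " ccc", " ddd", " eee"
print_r($match[1]);   // bbb, ccc, ddd, eee
```

The chain stops at the closing = because neither \s+ nor [^\s=]+ can match it, and \G prevents a later match from starting anywhere else.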

If you're also interested in matching more than one group of tokens, you also need to match the text between groups:

/(?(?=^)[^=]*+=(?=.*=)|\G\s*+(?:=[^=]*+=(?=.*=))?)([^\s=]+)/

regex101 Demo
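
A quick way to check the multi-group pattern in PHP. The input string here is a made-up example with two =...= groups, not taken from the question.

```php
<?php
// Hypothetical input with two '='-delimited groups.
$subject = 'aaa =bbb ccc= zzz =ddd eee= yyy';

// The optional (?:=[^=]*+=(?=.*=))? part skips the text between two groups.
$pattern = '/(?(?=^)[^=]*+=(?=.*=)|\G\s*+(?:=[^=]*+=(?=.*=))?)([^\s=]+)/';

preg_match_all($pattern, $subject, $match);

// Tokens from both groups; zzz and yyy are outside the groups and are skipped.
print_r($match[1]);   // bbb, ccc, ddd, eee
```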

You can use preg_replace with preg_split, i.e.:

$string = "aaa =bbb ccc ddd eee= zzz";
$matches = preg_split('/ /',  preg_replace('/^.*?=|=.*?$/', '', $string));
print_r($matches);

Output:

Array
(
    [0] => bbb
    [1] => ccc
    [2] => ddd
    [3] => eee
)

Demo:

http://ideone.com/pAmjbk
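
A possible variant of the same two-step idea, in case the tokens may be separated by more than one space: capture the text between the first pair of = characters, then split on runs of whitespace. This is only a sketch; the input with irregular spacing is an assumed example, and $tokens is an illustrative name.

```php
<?php
// Hypothetical input with irregular spacing.
$string = "aaa  =bbb   ccc ddd eee=  zzz";

$tokens = [];
// Step 1: grab the text between the first pair of '=' characters.
if (preg_match('/=([^=]*)=/', $string, $m)) {
    // Step 2: split on runs of whitespace and drop empty pieces.
    $tokens = preg_split('/\s+/', $m[1], -1, PREG_SPLIT_NO_EMPTY);
}

print_r($tokens);   // bbb, ccc, ddd, eee
```

Splitting on '/ /' as in the snippet above would produce empty entries for consecutive spaces; '/\s+/' with PREG_SPLIT_NO_EMPTY avoids that.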