Difference in match due to the position of negative lookahead?

Question

I have a lot of confusion about regular expressions and I am trying to work through it. Here I have the following string:

{start}do or die{end}extended string

And here are my two regexes, where the only thing I changed is the position of the dot:

(.(?!{end}))* //returns: {start}do or di
                                      //^ See here
((?!{end}).)* //returns: {start}do or die
                                      //^ See here

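Why does the first regex eat the last "e"?

Also, how does this negative lookahead make the * quantifier behave as if it were non-greedy? I mean, why can't it consume characters beyond {end}?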

Answer 1

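With a negative lookahead you assert that the regex inside it, in your case {end}, must not match at that position. And . matches any character except a newline.

So, your first regex: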

(.(?!{end}))*

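It leaves out the e because the e is immediately followed by {end}: the lookahead is checked after the dot has consumed a character, so it fails right after the e and that repetition is rejected. In your second regex the lookahead sits before the dot, so it is checked at the character about to be consumed. The dot can therefore keep going up to the position where {end} itself starts, which is why the e is included in your second regex.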
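To reproduce both results, here is a minimal JavaScript sketch. The variable names are mine, not from the question, and the braces are escaped (`\{end\}`) only so that they are unambiguously literal; that is an assumption about style, not something the original patterns require.

```javascript
// The sample string from the question.
const input = "{start}do or die{end}extended string";

// Lookahead is checked AFTER the dot has consumed a character.
const dotFirst = /(.(?!\{end\}))*/;
// Lookahead is checked BEFORE the dot consumes a character.
const lookaheadFirst = /((?!\{end\}).)*/;

const matchDotFirst = input.match(dotFirst)[0];
const matchLookaheadFirst = input.match(lookaheadFirst)[0];

console.log(matchDotFirst);       // {start}do or di
console.log(matchLookaheadFirst); // {start}do or die
```

The only difference between the two patterns is which position the lookahead inspects: the one after the consumed character, or the character about to be consumed.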

Answer 2

I have traced how the regex engine works through both regexes...

First, for (.(?!{end}))* the engine proceeds as follows...

"{start}do or die{end}extended string"
 ^ . (dot) matches "{"; the lookahead then tries {end} at the next position and fails to find it, so "{" is included
"{start}do or die{end}extended string"
  ^ . (dot) matches "s"; {end} is again not found after it, so "s" is included

....
....and so on...
"{start}do or die{end}extended string"
                ^ . (dot) matches "e", but {end} matches right after it, so the lookahead fails and "e" is excluded
So the match we get is "{start}do or di"
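The trace above can be written out as a small loop. This is only a hand simulation under simplifying assumptions, not how a real engine is implemented: `simulateDotFirst` is a hypothetical helper, it always starts at index 0, and it treats "\n" as the only character the dot refuses.

```javascript
const input = "{start}do or die{end}extended string";

// Hypothetical helper: hand-simulates (.(?!{end}))* from index 0.
function simulateDotFirst(text, terminator) {
  let pos = 0;
  while (pos < text.length) {
    // Step 1: the dot takes the character at pos (simplified: anything but "\n").
    if (text[pos] === "\n") break;
    // Step 2: the lookahead inspects the text AFTER that character.
    // If the terminator starts there, this repetition fails and the
    // character taken in step 1 is given back.
    if (text.startsWith(terminator, pos + 1)) break;
    pos += 1; // repetition succeeded, keep the character
  }
  return text.slice(0, pos);
}

const simulatedDotFirst = simulateDotFirst(input, "{end}");
console.log(simulatedDotFirst); // {start}do or di
```

The check at `pos + 1` is what costs the final "e": the character is only kept if the terminator does not start immediately after it.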

For the second regex, ((?!{end}).)*, ...

"{start}do or die{end}extended string"
 ^ the lookahead tries {end} here and fails to find it, so the dot consumes "{"

"{start}do or die{end}extended string"
  ^ the lookahead tries {end} here and fails again, so the dot consumes "s"

....
....and so on...
"{start}do or die{end}extended string"
                ^ {end} is not found here, so the dot consumes "e"
"{start}do or die{end}extended string"
                 ^ {end} matches here, so the negative lookahead fails, the group cannot repeat again, and the * stops

So we end up with the match "{start}do or die"
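This second trace also explains why the * never runs past {end}: the quantifier is still greedy, but every single repetition has to pass the lookahead first, and at the position where {end} begins no repetition can pass. A rough JavaScript sketch of that idea follows, where `simulateLookaheadFirst` is a hypothetical helper that starts at index 0 and treats "\n" as the only character the dot refuses.

```javascript
const input = "{start}do or die{end}extended string";

// Hypothetical helper: hand-simulates ((?!{end}).)* from index 0.
function simulateLookaheadFirst(text, terminator) {
  let pos = 0;
  while (pos < text.length) {
    // Step 1: the lookahead inspects the text AT the current position.
    if (text.startsWith(terminator, pos)) break; // no further repetition is possible
    // Step 2: only then does the dot consume one character (simplified: not "\n").
    if (text[pos] === "\n") break;
    pos += 1;
  }
  return text.slice(0, pos);
}

const simulated = simulateLookaheadFirst(input, "{end}");
const actual = input.match(/((?!\{end\}).)*/)[0];
const unguarded = input.match(/.*/)[0]; // same greedy *, but no lookahead guard

console.log(simulated); // {start}do or die
console.log(actual);    // {start}do or die
console.log(unguarded === input); // true: without the guard, * runs to the end of the line
```

So the lookahead does not make * lazy. It just removes every longer option the greedy quantifier would otherwise take.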

Tags: javascript, php, regex, string, lookahead