Get quotation marks inside a tag with regex

Hello. I'm trying to get all the quotation marks inside a specific start-end string. Let's say I have this string:

`Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]`

Now I want every " inside [start] .. [end] to be replaced with &quot;:

$string = 'Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]';
$regex = '/(?<=\[start])(.*?)(?=\[end])/';
$replace = '&quot;';

$string = preg_replace($regex,$replace,$string);

This matches the whole text between [start] and [end]. But I want to match only the " characters inside it:

//expected: Hello "world". [start]this is a &quot;mark&quot;[end]. It should work with [start]&quot;several&quot; &quot;marks&quot;[end]

Any ideas?

(?s)"(?=((?!\[start\]).)*\[end\])

Live demo

Explanation:

 (?s)                       DOT_ALL modifier
 "                          Literal "
 (?=                        Begin lookahead
      (                         # (1 start)
           (?! \[start\] )          Current position should not be followed by [start]
           .                        If yes then match
      )*                        # (1 end)
      \[end\]                   Until reaching [end]
 )                          End lookahead

PHP live demo
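
To see how the lookahead pattern plugs into the code from the question, here is a minimal sketch. It only adds the / delimiters around the regex above and passes it to preg_replace; the $result variable name is just for illustration. Note that it relies on every [end] being preceded by its own [start]: a quote outside a block that is followed by a stray [end], with no [start] in between, would also be replaced.

```php
<?php
// Input string taken from the question.
$string = 'Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]';

// Match a double quote only if [end] can be reached from it
// without passing over another [start] on the way.
$regex = '/(?s)"(?=((?!\[start\]).)*\[end\])/';

// Only the quote itself is consumed, so only the quote is replaced.
$result = preg_replace($regex, '&quot;', $string);

echo $result, PHP_EOL;
// => Hello "world". [start]this is a &quot;mark&quot;[end]. It should work with [start]&quot;several&quot; &quot;marks&quot;[end]
```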

An approach using preg_replace_callback allows a simpler regex (assuming your string always has paired, non-nested [start]...[end] blocks):

$string = 'Hello "world". [start]this is a "mark"[end]. It should work with [start]"several" "marks"[end]';
$regex = '/\[start].*?\[end]/s';
$string = preg_replace_callback($regex, function($m) {
    return str_replace('"', '&quot;', $m[0]);
},$string);
echo $string;
// => Hello "world". [start]this is a &quot;mark&quot;[end]. It should work with [start]&quot;several&quot; &quot;marks&quot;[end]

PHP IDEONE demo

The '/\[start].*?\[end]/s' regex matches [start], then any 0+ characters, as few as possible (including newlines, since the /s DOTALL modifier is used), and then [end].
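
The effect of the /s modifier is easiest to see on a block that spans two lines. The sketch below uses a made-up input string (not from the question) and runs the same callback with and without /s; the variable names are only illustrative.

```php
<?php
// Hypothetical input: the [start]...[end] block contains a newline.
$string = "[start]line \"one\"\nline \"two\"[end] and \"outside\"";

// Same callback as above: replace quotes inside the matched block.
$callback = function ($m) {
    return str_replace('"', '&quot;', $m[0]);
};

// With /s the dot also matches the newline, so the whole block is found.
$withS = preg_replace_callback('/\[start].*?\[end]/s', $callback, $string);

// Without /s the dot stops at the newline, so nothing matches
// and the string comes back unchanged.
$withoutS = preg_replace_callback('/\[start].*?\[end]/', $callback, $string);

echo $withS, PHP_EOL;
echo $withoutS, PHP_EOL;
```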

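If you need to make sure you get the shortest window between a [start] and the nearest [end], you will need a regex with a tempered greedy token, as in Revo's answer: '/\[start](?:(?!\[(?:start|end)]).)*\[end]/s' (see PHP demo and a regex demo).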
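
The difference only shows up when a [start] is left unclosed before another one begins. Here is a small sketch with a made-up input (not from the question) comparing the two patterns; the variable names are only illustrative.

```php
<?php
// Hypothetical input: the first [start] has no [end] of its own.
$string = '[start]"a" [start]"b"[end] "c"';

$callback = function ($m) {
    return str_replace('"', '&quot;', $m[0]);
};

// Lazy dot: the match begins at the FIRST [start] and runs to the first [end],
// so the quotes around a are replaced as well.
$lazy = preg_replace_callback('/\[start].*?\[end]/s', $callback, $string);

// Tempered greedy token: the match may not cross another [start] or [end],
// so only the innermost [start]"b"[end] window is matched.
$tempered = preg_replace_callback(
    '/\[start](?:(?!\[(?:start|end)]).)*\[end]/s',
    $callback,
    $string
);

echo $lazy, PHP_EOL;
// => [start]&quot;a&quot; [start]&quot;b&quot;[end] "c"
echo $tempered, PHP_EOL;
// => [start]"a" [start]&quot;b&quot;[end] "c"
```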