Unexpected preg_match result from pattern with "?:"

I am trying this pattern:

(?:(\d+)\/|)reports\/(\d+)-([\w-]+).html

against this string (preg_match with the modifiers "Axu"):

reports/683868-derger-gergewrger.html

I expect this match result (https://regex101.com/r/kX6yZ5/1):

[1] => 683868
[2] => derger-gergewrger

But I get:

[1] => 
[2] => 683868
[3] => derger-gergewrger

Why? Where does the empty value [1] come from, given that "?:" should make the group non-capturing?


I have two cases:

  1. "reports/683868-derger-gergewrger.html"
  2. "757/reports/683868-derger-gergewrger.html"

In the first case I need two captures, but in the second case I need three.

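The (?:...) group itself does not capture, but the (\d+) nested inside it is still capturing group 1, so it shows up as an empty entry when the empty alternative matches. If you do not need the leading number, you can use: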

preg_match('~(?:\d+/)?reports/(\d+)-([\w-]+)\.html~', 
           'reports/683868-derger-gergewrger.html', $m);
print_r($m);
Array
(
    [0] => reports/683868-derger-gergewrger.html
    [1] => 683868
    [2] => derger-gergewrger
)
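To see where the empty [1] comes from, here is a minimal sketch that runs the pattern from the question next to the one above on the same string. The "~" delimiter, the escaped dot and the variable names $orig and $fixed are my own choices for illustration; the "Axu" modifiers are taken from the question.

```php
<?php
$s = 'reports/683868-derger-gergewrger.html';

// Pattern from the question: (?:...) does not capture, but the (\d+)
// nested inside it is still capturing group 1, even when the empty
// alternative is the one that matches.
preg_match('~(?:(\d+)/|)reports/(\d+)-([\w-]+)\.html~Axu', $s, $orig);

// Same idea without capturing parentheses around the optional prefix.
preg_match('~(?:\d+/)?reports/(\d+)-([\w-]+)\.html~Axu', $s, $fixed);

var_dump(count($orig));     // int(4): [0], an empty [1], then [2] and [3]
var_dump($orig[1] === '');  // bool(true)
var_dump(count($fixed));    // int(3): [0], [1] and [2]
```

Group numbers are assigned by counting opening capturing parentheses from left to right, whether or not that part of the pattern takes part in the match.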

Edit: you may want this behavior instead:

$s = '757/reports/683868-derger-gergewrger.html';
preg_match('~(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)~',
           $s, $m); print_r($m);
Array
(
    [0] => 757/reports/683868-derger-gergewrger.html
    [1] => 757
    [2] => 683868
    [3] => derger-gergewrger
)

And:

$s = 'reports/683868-derger-gergewrger.html';

preg_match('~(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)~',
             $s, $m); print_r($m);
Array
(
    [0] => reports/683868-derger-gergewrger.html
    [1] => 683868
    [2] => derger-gergewrger
)

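(?|...) is a branch reset group. It is non-capturing itself, and the capture groups declared in each of its alternatives start numbering again from the same index.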
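If both URL shapes have to be handled in one place, the branch reset pattern could be wrapped along these lines. This is only a sketch: parseReportPath, the array keys and the added ^...$ anchors are hypothetical choices of mine, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the parts of a report path, or null if
// the path has neither of the two expected shapes.
function parseReportPath(string $path): ?array
{
    $re = '~^(?|(\d+)/reports/(\d+)-([\w-]+)\.html|reports/(\d+)-([\w-]+)\.html)$~u';
    if (preg_match($re, $path, $m) !== 1) {
        return null;
    }
    // With the branch reset, four entries means the prefixed form matched;
    // otherwise groups 1 and 2 hold the id and the slug.
    if (count($m) === 4) {
        return ['prefix' => $m[1], 'id' => $m[2], 'slug' => $m[3]];
    }
    return ['prefix' => null, 'id' => $m[1], 'slug' => $m[2]];
}

print_r(parseReportPath('757/reports/683868-derger-gergewrger.html'));
print_r(parseReportPath('reports/683868-derger-gergewrger.html'));
```

The count check works because preg_match drops trailing groups that did not take part in the match, which is why the second output above has only two captures.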