PCRE match after (lookbehind) with capture groups delimited by slashes and end of line

Question

I'm after a regex that captures {type} and {code} after propertyComplexity in the following string:

/json/score/propertyComplexity/{type}/{code}

{type} and {code} are variables and can be anything or nothing at all, e.g.

/json/score/propertyComplexity

I started with the following expression:

(?<=propertyComplexity)\/(.*)\/|$

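But this only captures the {type} between the two slashes; the closing delimiter of the capture group needs to be either a slash or the end of the line.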
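For illustration, here is a minimal sketch of how that first attempt behaves. The helper name firstAttempt is hypothetical, and ~ is used as the delimiter only so the slashes need no escaping. Since | has the lowest precedence, the pattern reads as "lookbehind, slash, group, slash" OR "end of line", so the $ never acts as an alternative closing delimiter for the group.

```php
<?php
// Hypothetical helper: runs the first-attempt pattern and returns capture
// group 1, or null when the group did not take part in the match.
function firstAttempt(string $path): ?string
{
    // Same expression as above, with ~ as the delimiter.
    $pattern = '~(?<=propertyComplexity)/(.*)/|$~';
    preg_match($pattern, $path, $m);
    return $m[1] ?? null;
}

// The greedy (.*) backtracks to the last slash, so only {type} is captured.
var_dump(firstAttempt('/json/score/propertyComplexity/{type}/{code}')); // string(6) "{type}"

// No closing slash: only the bare $ alternative matches, group 1 is unset.
var_dump(firstAttempt('/json/score/propertyComplexity/{type}')); // NULL
```

With four parameters the same greedy group would swallow "{type}/{code}/{param3}" as a single capture, which is why one group cannot yield one capture per segment here.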

The regex needs to capture every word that follows a slash after propertyComplexity, until terminated by the end of the line. For example:

/json/score/propertyComplexity/{type}/{code}/{param3}/{param4}

should produce 4 matches/capture groups: {type}, {code}, {param3}, {param4}

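If it helps, this is in the context of processing the @path attribute in a WADL. What gets captured is actually irrelevant, since it is the match count that will be used (it is compared against the count of passed arguments when determining which WADL resource to invoke).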

URL validity is not a requirement.

Since parameters in the WADL will always be wrapped in {param} placeholders, there is no need to match on or after the slashes.

So I'm simply using the following expression in preg_match_all() == count($this->passedArgs):

/{(.*?)}/
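A minimal sketch of that comparison, for illustration only: the helper name countPlaceholders and the sample values in $passedArgs are assumptions standing in for the real $this->passedArgs.

```php
<?php
// Hypothetical helper: counts the {param} placeholders in a WADL path template.
function countPlaceholders(string $path): int
{
    // preg_match_all() returns the number of full pattern matches,
    // or false on error (treated as 0 here).
    $count = preg_match_all('/{(.*?)}/', $path);
    return $count === false ? 0 : $count;
}

// Hypothetical stand-in for $this->passedArgs.
$passedArgs = array('residential', 'A1');

$path = '/json/score/propertyComplexity/{type}/{code}';
var_dump(countPlaceholders($path) == count($passedArgs)); // bool(true)
```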

Thanks to everyone for contributing. The answer is awarded to Jaytea.

Answer 1

Here is a solution that combines a regex with explode(), because I'm pretty sure it's not possible to capture (or count) repeated groups with PCRE alone.

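The regex pattern expects every /-delimited segment after /propertyComplexity (I prefixed it with a slash to be a little stricter) to be non-empty. It therefore allows any non-empty content, not just content wrapped in braces such as {type}.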

The pattern is a bit more complex than it probably needs to be, but it keeps the explode step short (no need to trim slashes).

The actual argument values end up in the array $arguments, but for brevity I don't show them in the results.

$urls = array(
  '/json/score/propertyComplexity',
  '/json/score/propertyComplexity/',
  '/json/score/propertyComplexity//', // invalid
  '/json/score/propertyComplexity/{type}',
  '/json/score/propertyComplexity/{type}/',
  '/json/score/propertyComplexity/{type}//', // invalid
  '/json/score/propertyComplexity/{type}/{code}',
  '/json/score/propertyComplexity/{type}/{code}/{param3}',
  '/json/score/propertyComplexity/{type}/{code}/{param3}/{param4}',
  '/json/score/propertyComplexity/{type}/{code}/{param3}/{param4}/{param5}'
);

foreach( $urls as $url ) {
  printf( 'testing %s' . PHP_EOL, $url );
  if( preg_match( '~(?<=/propertyComplexity)(?:/(?<arguments>[^/]+(/[^/]+)*))?(?:/?$)~', $url, $matches ) ) {
    $arguments = isset( $matches[ 'arguments' ] ) ? explode( '/', $matches[ 'arguments' ] ) : array();
    printf( '  URL is valid: argument count is %d' . PHP_EOL, count( $arguments ) );
  }
  else {
    echo '  URL is invalid' . PHP_EOL;
  }
  echo PHP_EOL;
}


Result:

testing /json/score/propertyComplexity
  URL is valid: argument count is 0

testing /json/score/propertyComplexity/
  URL is valid: argument count is 0

testing /json/score/propertyComplexity//
  URL is invalid

testing /json/score/propertyComplexity/{type}
  URL is valid: argument count is 1

testing /json/score/propertyComplexity/{type}/
  URL is valid: argument count is 1

testing /json/score/propertyComplexity/{type}//
  URL is invalid

testing /json/score/propertyComplexity/{type}/{code}
  URL is valid: argument count is 2

testing /json/score/propertyComplexity/{type}/{code}/{param3}
  URL is valid: argument count is 3

testing /json/score/propertyComplexity/{type}/{code}/{param3}/{param4}
  URL is valid: argument count is 4

testing /json/score/propertyComplexity/{type}/{code}/{param3}/{param4}/{param5}
  URL is valid: argument count is 5

Answer 2

You can use preg_match_all together with the following expression to achieve what you want:

/(?(?=^).*?propertyComplexity(?=(?:\/[^\/]+)*\/?$)|\G)\/\K([^\/]+)/

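The \G assertion matches at the first matching position in the subject, which in this context effectively means the position where the previous match ended. The basic strategy is to validate the format of the whole string on the first round (the conditional's (?=^) branch), then capture one attribute at a time in each subsequent round (the \G branch). The \K resets the start of the reported match, so the leading slash is not included in it.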

I'm not sure how strict you want the validation to be, so I kept it simple.
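As a rough usage sketch (the wrapper name pathArguments is hypothetical and not part of the answer), the expression can be run through preg_match_all like this:

```php
<?php
// Hypothetical helper: returns the path segments found after propertyComplexity.
function pathArguments(string $path): array
{
    // Round 1 starts at offset 0, where (?=^) holds: validate the whole tail.
    // Later rounds start where the previous match ended, where only \G holds.
    $pattern = '/(?(?=^).*?propertyComplexity(?=(?:\/[^\/]+)*\/?$)|\G)\/\K([^\/]+)/';
    preg_match_all($pattern, $path, $matches);
    return $matches[1]; // one entry per captured segment
}

print_r(pathArguments('/json/score/propertyComplexity/{type}/{code}/{param3}/{param4}'));
var_dump(count(pathArguments('/json/score/propertyComplexity')));         // int(0)
var_dump(count(pathArguments('/json/score/propertyComplexity/{type}/'))); // int(1)
```

One thing to be aware of with this sketch: a path that fails the up-front validation, such as one containing an empty segment (//), simply produces zero matches, so by count alone it looks the same as a valid path with no arguments.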

php

regex

pcre