How to make capture group (.+?) work as expected in Cucumber Java?

Question

I observed the following while writing Cucumber Java steps for a feature file:

Feature file:

Then I get result one <result1> and result two <result2> from microservice

Java steps (step definitions):

@Then("^I get result one(.+?) and result two(.+?)$")  //step function 1
public void i_get_result_one_and_result_two(String result1, String result2)
        throws Throwable {}

@Then("^I get result one(.+?) and result two(.+?) from microservice$")  //step function 2
public void i_get_result_one_and_result_two_from_ms(String result1, String result2)
        throws Throwable {}

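The step in the feature file always maps to step function 1 and never to step function 2.

As I understand it, the capture group (.+?) is defined to match one or more of anything (I assumed it would match only the variable in the feature file). I don't understand why the step is not matched to step function 2.

Why does this happen, and how can I fix it?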

Answer 1

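You are right: the pattern (.+?) matches a group of any characters, one or more times (using a reluctant quantifier). However, the second group has to end at the end of the string (because of the $ in the (.+?)$ part).

The pattern ^I get result one(.+?) and result two(.+?)$ therefore matches both strings. I have put the parts captured by the groups in parentheses.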

^I get result one( <result1>) and result two( <result2> from microservice)$
^I get result one( <result1>) and result two( <result2>)$
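
The two matches above can be reproduced with plain java.util.regex, outside of Cucumber. The following is only a minimal sketch: the class name UnquotedPatternDemo and the helper captures are made up for illustration, and the step text is passed without the Then keyword on the assumption that the pattern is matched against the step text alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnquotedPatternDemo {
    // Pattern of step function 1, copied from the question.
    static final Pattern STEP_1 =
            Pattern.compile("^I get result one(.+?) and result two(.+?)$");

    // Hypothetical helper: returns both captured groups, or null if the step does not match.
    static String[] captures(String step) {
        Matcher m = STEP_1.matcher(step);
        return m.matches() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        String[] steps = {
            "I get result one <result1> and result two <result2> from microservice",
            "I get result one <result1> and result two <result2>"
        };
        for (String step : steps) {
            String[] groups = captures(step);
            // Brackets make the leading space inside each group visible.
            System.out.printf("group 1: [%s]  group 2: [%s]%n", groups[0], groups[1]);
        }
    }
}
```

For the first step, group 2 comes out as " <result2> from microservice": the reluctant quantifier has to keep expanding until $ can match.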

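You could rephrase your steps so that one pattern does not match both sentences, or you could enclose the variable fields in the step, for example in single quotes ~~(it must be a character that never occurs in the matched value)~~, and amend the patterns accordingly.

It could look like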

// steps
Then I get result one '<result1>' and result two '<result2>' from microservice
Then I get result one '<result1>' and result two '<result2>'

// glue code
@Then("^I get result one '(.+?)' and result two '(.+?)' from microservice$")
@Then("^I get result one '(.+?)' and result two '(.+?)'$")
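
To check that the two quoted patterns no longer overlap for these steps, they can be tried against both step texts with plain java.util.regex. This is a sketch only; the class QuotedPatternsDemo and its matches helper are hypothetical names, and no Cucumber classes are involved.

```java
import java.util.regex.Pattern;

public class QuotedPatternsDemo {
    // The two quoted patterns from the glue code above.
    static final Pattern WITH_MS = Pattern.compile(
            "^I get result one '(.+?)' and result two '(.+?)' from microservice$");
    static final Pattern WITHOUT_MS = Pattern.compile(
            "^I get result one '(.+?)' and result two '(.+?)'$");

    // Hypothetical helper: true if the pattern matches the whole step text.
    static boolean matches(Pattern pattern, String step) {
        return pattern.matcher(step).matches();
    }

    public static void main(String[] args) {
        String withMs =
                "I get result one '<result1>' and result two '<result2>' from microservice";
        String withoutMs =
                "I get result one '<result1>' and result two '<result2>'";

        // Each step text should be accepted by exactly one of the two patterns.
        System.out.println("WITH_MS    / step with suffix:    " + matches(WITH_MS, withMs));
        System.out.println("WITHOUT_MS / step with suffix:    " + matches(WITHOUT_MS, withMs));
        System.out.println("WITH_MS    / step without suffix: " + matches(WITH_MS, withoutMs));
        System.out.println("WITHOUT_MS / step without suffix: " + matches(WITHOUT_MS, withoutMs));
    }
}
```

The second pattern rejects the step ending in "from microservice" because it requires a closing quote directly before the end of the line.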

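Edit: here is a more detailed explanation of how the matching works.

First, an explanation of the patterns (.+?) and ([^']+?). The ? after the + makes the quantifier reluctant: the matcher consumes characters from left to right and takes only as many as are needed for the rest of the pattern to match (see reluctant quantifiers).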
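
Before going through the patterns, the difference between a reluctant and a greedy quantifier can be seen in a small experiment. The snippet below is an illustrative sketch; the class ReluctantVsGreedyDemo, the helper firstGroup and the sample input are not part of the original question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReluctantVsGreedyDemo {
    // Hypothetical helper: returns group 1 of the first match found in the input, or null.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "one 'a' and two 'b'";

        // Reluctant: stops at the first closing quote.
        System.out.println(firstGroup("'(.+?)'", input));   // a
        // Greedy: runs to the last quote in the input.
        System.out.println(firstGroup("'(.+)'", input));    // a' and two 'b
        // Negated character class: can never cross a quote.
        System.out.println(firstGroup("'([^']+)'", input)); // a
    }
}
```

Note that find() is used here without anchors. Once the pattern is anchored with ^ and $, as in the step definitions, even a reluctant group is forced to grow until the whole line matches.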

^I get result one '(.+?)' and result two '(.+?)'$

^ --- begin of the line
I get result one ' --- a fixed sequence
(.+?) --- any character, one or more times (group 1)
' and result two ' --- a fixed sequence
(.+?) --- any character, one or more times (group 2)
' --- a fixed sequence
$ --- end of the line

group 1 and group 2 can contain any character, including '.

^I get result one '([^']+?)' and result two '([^']+?)'$

^ --- begin of the line
I get result one ' --- a fixed sequence
([^']+?) --- any character, except the single quote, one or more times (group 1)
' and result two ' --- a fixed sequence
([^']+?) --- any character, except the single quote, one or more times (group 2)
' --- a fixed sequence
$ --- end of the line

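As soon as the value for group 1 or group 2 contains a ', the line no longer matches.
For example: I get result one '<O'Reilly>' and result two '<result2>'
Here group 1 would be <O, after which the pattern expects the fixed sequence ' and result two ' but finds 'Reilly>' ... instead, so there is no match.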

Some demo snippets

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("^I get result one '(.+?)' and result two '(.+?)'$");
Matcher matcher = pattern.matcher("I get result one '<result1>' and result two '<result2>'");
while (matcher.find()) {
    for (int i = 0; i <= matcher.groupCount(); i++) {
        System.out.printf("group: %d  subsequence: %s%n", i, matcher.group(i));
    }
}

output

group: 0  subsequence: I get result one '<result1>' and result two '<result2>'
group: 1  subsequence: <result1>
group: 2  subsequence: <result2>

group 0 holds the text matched by the whole expression

Pattern pattern = Pattern.compile("^I get result one '(.+?)' and result two '(.+?)'$");
Matcher matcher = pattern.matcher("I get result one '<O'Reilly>' and result two '<result2>'");

output

group: 0  subsequence: I get result one '<O'Reilly>' and result two '<result2>'
group: 1  subsequence: <O'Reilly>
group: 2  subsequence: <result2>

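group 1 also captures the ' because (.+?) is embedded between the fixed sequences before and after it.

Now the pattern which excludes the surrounding character.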

Pattern pattern = Pattern.compile("^I get result one '([^']+?)' and result two '([^']+?)'$");
Matcher matcher = pattern.matcher("I get result one '<result1>' and result two '<result2>'");

output

group: 0  subsequence: I get result one '<result1>' and result two '<result2>'
group: 1  subsequence: <result1>
group: 2  subsequence: <result2>

There is no difference to the pattern (.+?), because the values to be captured by group 1 and group 2 do not contain a '.

Pattern pattern = Pattern.compile("^I get result one '([^']+?)' and result two '([^']+?)'$");
Matcher matcher = pattern.matcher("I get result one '<O'Reilly>' and result two '<result2>'");

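There is no output, because the pattern does not match the line (see the explanation above). This also means that Cucumber will not be able to find the related glue method.

Assume the step is defined in the feature file as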

Then I get result one '<O'Reilly>' and result two '<result2>'

and the glue method is annotated as

@Then("^I get result one '([^']+?)' and result two '([^']+?)'$")

Running Cucumber then raises the following exception

cucumber.runtime.junit.UndefinedThrowable: The step "I get result one '<O'Reilly>' and result two '<result2>'" is undefined

Answer 2

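Explanation of the problem

Since you are using regular expressions to match your steps, note that .+ matches any character one or more times. The trailing ? makes the quantifier reluctant, but the $ anchor still forces the last group to extend to the end of the line.

This on its own means that in your step:

^I get result one (.+?) and result two (.+?)$

the last capture group matches everything from its starting position to the end of the line.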

Answer

If you want the capture groups to match only what is inside the quotes, you should use this instead:

^I get result one '([^']+?)' and result two '([^']+?)'$

Here, [^']+ matches any character that is not a single quote/apostrophe, one or more times.

(You could also use double quotes instead of single quotes.)
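
If double quotes are used, they have to be escaped inside the Java string literal. The following sketch shows what that variant might look like when tried with plain java.util.regex; the class DoubleQuotedPatternDemo and the helper captures are hypothetical names, and the step texts are sample data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DoubleQuotedPatternDemo {
    // Double-quote variant of the pattern; \" is a literal quote in the Java string.
    static final Pattern STEP = Pattern.compile(
            "^I get result one \"([^\"]+?)\" and result two \"([^\"]+?)\"$");

    // Hypothetical helper: returns both captured groups, or null if the step does not match.
    static String[] captures(String step) {
        Matcher m = STEP.matcher(step);
        return m.matches() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        // An apostrophe inside the value is fine, because only " is excluded.
        String[] groups = captures(
                "I get result one \"<O'Reilly>\" and result two \"<result2>\"");
        System.out.printf("group 1: %s  group 2: %s%n", groups[0], groups[1]);

        // The trailing text is not covered by the pattern, so nothing matches.
        System.out.println(captures(
                "I get result one \"<result1>\" and result two \"<result2>\" from microservice"));
    }
}
```

With double quotes as delimiters, values containing an apostrophe such as <O'Reilly> still match, while a value containing a double quote would not.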

regex

cucumber

gherkin

cucumber-jvm