Lisp-like string matching in Java

Question

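I'm having some trouble matching strings of this form: (foo "bar"). To be precise, I want to capture

an opening parenthesis, followed by
zero or more whitespace characters, followed by
at least one word character, followed by
whitespace again, zero or more, followed by
one or more word characters enclosed in double quotes, followed by
optional whitespace and a closing parenthesis.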

Next I'd like to extract foo and bar, but that's a separate question. The best I've come up with is \( [\s]? [\w]+ [\s]? \" [\w]+ \" [\s]? \), and I've been using an online resource to check my regex.

Can you point out what is wrong with my regex?

Answer 1

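Your regex contains extra space characters, and they cause the pattern not to match. The square brackets are not needed either. The question mark means zero or one occurrence, but no more. To denote zero or more, you should use *. The following matches the string and uses two capturing groups (the parts in unescaped parentheses) to extract foo and bar: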

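import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Backslashes are doubled because the regex is written as a Java string literal.
Pattern pattern = Pattern.compile("\\(\\s*(\\w+)\\s*\"(\\w*)\"\\s*\\)");
Matcher matcher = pattern.matcher("(foo \"bar\")");
if (matcher.find()) {
    System.out.println(matcher.group(1));    // foo
    System.out.println(matcher.group(2));    // bar
}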
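To make the two problems concrete, here is a small sketch that runs the pattern from the question and a corrected one against a few inputs. The class name QuantifierDemo and the helper matches are made up for this illustration, and FIXED is just one possible correction (it uses \w+ inside the quotes, as the question asks for one or more characters).

```java
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Pattern from the question: literal spaces between tokens and '?' quantifiers.
    static final Pattern ORIGINAL =
            Pattern.compile("\\( [\\s]? [\\w]+ [\\s]? \" [\\w]+ \" [\\s]? \\)");

    // One possible correction: no literal spaces, '*' for "zero or more".
    static final Pattern FIXED =
            Pattern.compile("\\(\\s*\\w+\\s*\"\\w+\"\\s*\\)");

    // Hypothetical helper: true only if the whole input matches the pattern.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "(foo \"bar\")",          // ORIGINAL: false, FIXED: true
            "(  foo   \"bar\"  )",    // ORIGINAL: false, FIXED: true
            "(  foo  \" bar \"  )"    // ORIGINAL: true,  FIXED: false
        };
        for (String input : inputs) {
            System.out.println(input
                    + "  original=" + matches(ORIGINAL, input)
                    + "  fixed=" + matches(FIXED, input));
        }
    }
}
```

The last input shows what the original pattern really demands: a space at every position where one was typed into the regex, including inside the quotes.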

Answer 2

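You don't need to surround character classes like \w or \s with [ ]; [\s] is the same as \s. The only case where you need [ ] is when you want to build a new character class by combining existing ones, like [\s\d], which matches a character that is either whitespace or a digit.
Also, by default spaces in a regex are part of the pattern, so "\s " will match two spaces: one for \s and one for the literal space.
"Zero or more" is written as *; ? means zero or one.
If you want to write the regex as a Java string, you also need to escape each \ by putting another \ in front of it.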
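The points above can be checked directly. This is a minimal sketch; the class name CharClassDemo and the helper fullMatch are made up for this illustration. The last check relies on the standard Pattern.COMMENTS flag, which is the non-default mode in which whitespace inside the pattern is ignored.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: true only if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // [\s] and \s describe the same single character.
        System.out.println(fullMatch("[\\s]", " "));      // true
        System.out.println(fullMatch("\\s", " "));        // true

        // Brackets earn their keep when combining classes: whitespace or digit.
        System.out.println(fullMatch("[\\s\\d]", "7"));   // true
        System.out.println(fullMatch("[\\s\\d]", "a"));   // false

        // A space typed into the pattern must be matched by a space in the input.
        System.out.println(fullMatch("\\s ", "  "));      // true
        System.out.println(fullMatch("\\s ", " "));       // false

        // With Pattern.COMMENTS the space in the pattern is ignored instead.
        Pattern relaxed = Pattern.compile("\\s ", Pattern.COMMENTS);
        System.out.println(relaxed.matcher(" ").matches()); // true
    }
}
```

Note that every regex backslash appears doubled in the source, because the Java compiler consumes one level of escaping before the regex engine sees the string.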

So try this regex, written here as a Java string literal:

"\\(\\s*\\w+\\s*\"\\w+\"\\s*\\)"

which represents:

\(         - 1. An opening parenthesis
   \s*     - 2. Zero or more whitespace chars
   \w+     - 3. At least one word character
   \s*     - 4. Whitespace again, zero or more
   \"       - 5. opening quotation
   \w+     - 5. One or more char - I am not sure which symbols you want to add here
                 but you can for instance add them manually with [\w+\-*/=<>()]+
   \"       - 5. closing quotation
   \s*     - 6. Optional whitespace
\)         - 6. closing parenthesis
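As a sketch of the suggestion in step 5, the quoted part can use the wider character class instead of \w+. The class name SymbolClassDemo and the method accepts are made up for this illustration, and the exact set of extra symbols is an assumption; pick whichever characters your input really allows.

```java
import java.util.regex.Pattern;

public class SymbolClassDemo {
    // Step 5 widened: word characters plus a few operator symbols and parentheses.
    // The hyphen is escaped so it is not read as a range inside the brackets.
    static final Pattern WIDE = Pattern.compile(
            "\\(\\s*\\w+\\s*\"[\\w+\\-*/=<>()]+\"\\s*\\)");

    // Hypothetical helper: true only if the whole input matches.
    static boolean accepts(String input) {
        return WIDE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("(foo \"bar\")"));   // true
        System.out.println(accepts("(foo \"a+b\")"));   // true
        System.out.println(accepts("(foo \"<=\")"));    // true
        System.out.println(accepts("(foo \"a b\")"));   // false: space is not in the class
        System.out.println(accepts("(foo \"\")"));      // false: + needs at least one char
    }
}
```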

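Now, if you want to get parts of the matched text, you can use groups: surround the part you want with unescaped parentheses. For example, the regex \w+ (\w+) finds pairs of words, but only the second word is placed in a group (with index 1). To get the contents of that group, just call group(index) on the Matcher instance: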

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("\\w+ (\\w+)");
Matcher matcher = pattern.matcher("ab cd efg hi jk");

// find() advances to the next match on each call
while (matcher.find()) {
    System.out.println("entire match =\t" + matcher.group());
    System.out.println("second word =\t" + matcher.group(1));
    System.out.println("---------------------");
}

Output:

entire match =  ab cd
second word =   cd
---------------------
entire match =  efg hi
second word =   hi
---------------------
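Applying the same find() loop to the pattern from the question could look like the sketch below. It uses named groups (available since Java 7) instead of indexes; the group names head and arg, the class name FormExtractor, and the method extract are all made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FormExtractor {
    // "head" and "arg" are illustrative group names, not part of the original answers.
    private static final Pattern FORM = Pattern.compile(
            "\\(\\s*(?<head>\\w+)\\s*\"(?<arg>\\w+)\"\\s*\\)");

    // Returns one {head, arg} pair per form found in the input, in order.
    static List<String[]> extract(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = FORM.matcher(input);
        while (m.find()) {
            pairs.add(new String[] { m.group("head"), m.group("arg") });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : extract("(foo \"bar\") ( baz\"qux\" )")) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
        // prints:
        // foo -> bar
        // baz -> qux
    }
}
```

Because find() looks for the pattern anywhere in the input, surrounding text is skipped silently; use matches() instead if the whole input must be exactly one form.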
