How to get the remaining unmatched string when using Pattern and Matcher in Java regex?

Question

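I am reading data from a continuous string buffer in which each message runs from "8=XXX" to "10=XXX". Suppose the first buffer scan returns the text below; this is the entire string I got in one scan.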

8=FIX.4.2|9=00815|35=W|49=TT_PRICE|56=SAP0094X|10=134| 
8=FIX.4.2|9=00816|35=W49=TT_PRICE   <-- here I didn't get the full string

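Now I want the strings that start with "8=xxx" and end with "10=xxx|". I have written a program for this and it works well. The problem is that when I pass the string above to the matcher, I only get the strings that run exactly from "8=xxx" to "10=xxx"; the remaining part that does not match is dropped. I want that remaining part as well.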

|56=SAP0094X|10=134|   <-- this is the remaining part of the dropped string above
8=FIX.4.2|9=00815|35=W|49=TT_PRICE|56=SAP0094X|10=134|

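In the next buffer scan I will get the string that is the remaining part of the string dropped during pattern matching. So the string dropped in the first search is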

8=FIX.4.2|9=00816|35=W49=TT_PRICE

and the leftover string in the next search is

|56=SAP0094X|10=134|

These two strings need to be appended, like this:

8=FIX.4.2|9=00816|35=W49=TT_PRICE|56=SAP0094X|10=134|

This is the complete string.

Below is my code:

String text = in.toString(CharsetUtil.UTF_8); // in is a reference to a ByteBuf
Pattern r = Pattern.compile("(8=\\w\\w\\w)[\\s\\S]*?(10=\\w\\w\\w)");
Matcher m = r.matcher(text);

while (m.find()) {
    String message = m.group();
    // I need the remaining unmatched text here, so that it can be joined with the
    // text from the next scan and I get the whole string from "8=xxx" to "10=xxx|"
    System.out.println("Incoming From Exchange >> " + message);
}

Answer 1

You can use groups for this:

public static void main(String[] args) {
    String someInput = "XXX-payload-YYY-some-tail";
    Pattern r = Pattern.compile("(XXX)(.*)(YYY)(.*)");
    Matcher m = r.matcher(someInput);

    if (m.matches()) {
        System.out.println("initial token: " + m.group(1));
        System.out.println("payload: " + m.group(2));
        System.out.println("end token: " + m.group(3));
        System.out.println("tail: " + m.group(4));
    }
}

Output:

initial token: XXX
payload: -payload-
end token: YYY
tail: -some-tail

You can then concatenate the "tail" with the result of the second scan and parse it again.
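As a rough sketch of that "keep the tail and re-parse" idea applied to the data in the question: the class below keeps the unmatched remainder in a `StringBuilder` and uses `Matcher.end()` to find where the last complete message stopped. The names `FixMessageSplitter`, `feed` and `pending` are made up for illustration, and the pattern is the one from the question with a trailing `\\|` added so that a message only counts as complete once its closing "|" has arrived. It assumes the text "10=" followed by three word characters and "|" never occurs inside a message body.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: buffers partial messages between scans.
public class FixMessageSplitter {
    // Pattern from the question, plus the trailing '|' (an assumption about the framing).
    private static final Pattern MESSAGE =
            Pattern.compile("8=\\w\\w\\w[\\s\\S]*?10=\\w\\w\\w\\|");

    // Text received so far that has not yet formed a complete message.
    private final StringBuilder pending = new StringBuilder();

    // Appends one scan's text and returns every complete message found so far.
    public List<String> feed(String chunk) {
        pending.append(chunk);
        List<String> messages = new ArrayList<>();
        Matcher m = MESSAGE.matcher(pending);
        int consumed = 0;
        while (m.find()) {
            messages.add(m.group());
            consumed = m.end(); // index just past the last complete message
        }
        // Drop everything up to the last match; what is left is the unmatched tail.
        pending.delete(0, consumed);
        return messages;
    }

    // The unmatched remainder that is waiting for the next scan.
    public String pending() {
        return pending.toString();
    }

    public static void main(String[] args) {
        FixMessageSplitter splitter = new FixMessageSplitter();
        String scan1 = "8=FIX.4.2|9=00815|35=W|49=TT_PRICE|56=SAP0094X|10=134|\n"
                + "8=FIX.4.2|9=00816|35=W49=TT_PRICE";
        String scan2 = "|56=SAP0094X|10=134|\n"
                + "8=FIX.4.2|9=00815|35=W|49=TT_PRICE|56=SAP0094X|10=134|";
        for (String scan : new String[] {scan1, scan2}) {
            for (String message : splitter.feed(scan)) {
                System.out.println("Incoming From Exchange >> " + message);
            }
        }
    }
}
```

In the question's Netty code, `feed(in.toString(CharsetUtil.UTF_8))` would take the place of the `while (m.find())` loop, with one splitter instance kept per connection. If the buffer can contain tags such as "110=", anchoring the end of the pattern on `\\|10=` would be a safer choice.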

java

regex

buffer