Matcher.replaceAll() removes backslashes even when I escape them (Java)

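My application has some functionality that replaces text inside JSON (I have simplified it in the example). The replacements may contain escape sequences such as \n, \b, \t and so on, which can break the JSON string when I try to build the JSON with Jackson. So I decided to use Apache's solution, StringEscapeUtils.escapeJava(), to escape all escape sequences. But Matcher.replaceAll() removes the backslashes added by escapeJava().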

Here is the code:

public static void main(String[] args) {
    String json = "{\"test2\": \"Hello toReplace \\\"test\\\" world\"}";

    String replacedJson = Pattern.compile("toReplace")
            .matcher(json)
            .replaceAll(StringEscapeUtils.escapeJava("replacement \n \b \t"));

    System.out.println(replacedJson);
}

Expected output:

{"test2": "Hello replacement \n \b \t \"test\" world"}

Actual output:

{"test2": "Hello replacement n b t \"test\" world"}

Why does Matcher.replaceAll() remove the backslashes, while System.out.println(StringEscapeUtils.escapeJava("replacement \n \b \t")); prints the correct output: replacement \n \b \t

If you want a literal backslash in the replacement passed to replaceAll, you need to escape it. This is described in the documentation.

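StringEscapeUtils.escapeJava escapes a string so that it is suitable for use in Java source code, but it does not let you write unescaped strings in your own source code. In your literal, the compiler has already turned the escape sequences into control characters: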

"replacement \n \b \t"
             ^ new line
                 ^ backspace
                    ^ tab

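If you want literal backslashes in a regular Java string, you need:

"replacement \\n \\b \\t"

Because this Java string is the replacement argument of replaceAll, where a backslash is itself an escape character, you need:

"replacement \\\\n \\\\b \\\\t"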

Try:

    String replacedJson = Pattern.compile("toReplace")
            .matcher(json)
            .replaceAll("replacement \\\\n \\\\b \\\\t");
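The layers of escaping described above can be checked with a minimal sketch that uses only the JDK. The class name EscapeLayers and the helper replace are made up for illustration; the lengths show how many characters each literal really contains, and the last two calls show what replaceAll does with them.

```java
import java.util.regex.Pattern;

public class EscapeLayers {
    // Hypothetical helper: the pattern is a plain literal, so only the
    // escaping of the replacement string affects the result.
    static String replace(String input, String replacement) {
        return Pattern.compile("toReplace").matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println("\n".length());     // 1: a real newline character
        System.out.println("\\n".length());    // 2: backslash followed by n
        System.out.println("\\\\n".length());  // 3: two backslashes followed by n

        // replaceAll consumes one level of backslashes in the replacement.
        System.out.println(replace("Hello toReplace", "\\n"));    // Hello n
        System.out.println(replace("Hello toReplace", "\\\\n"));  // Hello \n
    }
}
```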

StringEscapeUtils.escapeJava("\n") converts the single newline character into two characters: \ and n.

\ is a special character in a pattern replacement; from https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#replaceAll(java.lang.String):

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
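To make the quoted rule concrete, here is a small sketch of how the replacement string is interpreted. The class ReplacementSyntax and its two methods are hypothetical names chosen for this example; only java.util.regex is used.

```java
import java.util.regex.Pattern;

public class ReplacementSyntax {
    // $1 and $2 in the replacement refer to the captured groups.
    static String swap(String input) {
        return Pattern.compile("(\\w+)=(\\w+)").matcher(input).replaceAll("$2=$1");
    }

    // A backslash in the replacement escapes the next character, so the
    // replacement \$5 (written "\\$5" in source) yields a literal dollar sign.
    static String price(String input) {
        return Pattern.compile("PRICE").matcher(input).replaceAll("\\$5");
    }

    public static void main(String[] args) {
        System.out.println(swap("key=value"));      // value=key
        System.out.println(price("cost: PRICE"));   // cost: $5
    }
}
```

The same rule is what strips the backslash in front of n, b and t in the question: each backslash is consumed as an escape for the character that follows it.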

To have them treated as literal characters, you need to escape them with Matcher.quoteReplacement; from https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#quoteReplacement(java.lang.String):

Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. The String produced will match the sequence of characters in s treated as a literal sequence. Slashes (\) and dollar signs ($) will be given no special meaning.

So in your case:

.replaceAll(Matcher.quoteReplacement(StringEscapeUtils.escapeJava("replacement \n \b \t")))

You also have to escape the \ with Matcher.quoteReplacement():

public static String replaceAll(String json, String regex, String replace) {
    return Pattern.compile(regex)
                  .matcher(json)
                  .replaceAll(Matcher.quoteReplacement(StringEscapeUtils.escapeJava(replace)));
}
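For readers who want to try this without the Apache dependency, the following is a minimal sketch using only the JDK. The class LiteralReplace and the helper replaceLiteral are hypothetical names, and the hand-written string escaped is an assumption standing in for what escapeJava would return for the question's input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplace {
    // Hypothetical helper: replaces every match of regex with the
    // replacement taken literally (no special meaning for \ or $).
    static String replaceLiteral(String input, String regex, String replacement) {
        return Pattern.compile(regex)
                      .matcher(input)
                      .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String json = "{\"test2\": \"Hello toReplace \\\"test\\\" world\"}";
        // Assumed stand-in for the escapeJava output: literal backslash-n,
        // backslash-b and backslash-t, two characters each.
        String escaped = "replacement \\n \\b \\t";
        System.out.println(replaceLiteral(json, "toReplace", escaped));
        // prints {"test2": "Hello replacement \n \b \t \"test\" world"}
    }
}
```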