Java trim excessive symbols

Question

How can I trim excess non-digit, non-letter characters, as in the following example?

String test = "Hey this is a string with lots of symbols!!!!!@@@@@#####";

The output should be:

Hey this is a string with lots of symbols!@#

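This is what I have at the moment, but it has some weird side effects and it is far too clunky:

(The first goal is just to trim them; the second goal is to get it down to a 2-3 liner.)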

    String precheck = message.replaceAll("[a-zA-Z]", "");

    precheck = precheck.replaceAll("[0-9]+/*\\.*[0-9]*", "");
    precheck = precheck.trim();

    String[] allowed = {
            "!","\"","'","-",">","<","+","_","^","@","#","=","/","\\"
    };

    for(char c : precheck.toString().toCharArray())
    {
        boolean contains = false;
        for(String symbol : allowed)
        {
            if(c == symbol.toCharArray()[0]){
                contains = true;
            }
        }

        if(!contains){
            message = message.replace(String.valueOf(c), "");
            message = message.trim();
        }
    }

    for(String symbol : allowed)
    {
        if (message.contains(symbol)){
            int count = 0;

            for (int i = 0; i < message.length(); i++){
                if (message.charAt(i) == symbol.toCharArray()[0]){
                    count++;
                }
            }

            if(count > 2){
                for(int i = 0;i < (count-2);i++){
                    message = message.replaceFirst(symbol, "");
                }
            }
        }
    }

    return message;

Answer 1

You can just use this regex replacement:

str = str.replaceAll("([^\\p{L}\\p{N}])\\1+", "$1");

RegEx Demo

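Explanation: This regex matches any non-digit, non-letter character and captures it as group #1. It then uses \1+ to match one or more further occurrences of that same captured character, and replaces the whole run with the first part, i.e. $1.

PS: This lookahead-based regex also works:

str = str.replaceAll("([^\\p{L}\\p{N}])(?=\\1+)", "");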
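To try both variants from this answer side by side, here is a minimal sketch. The class and method names (`CollapseRepeats`, `collapseWithBackreference`, `collapseWithLookahead`) are made up for illustration and are not part of the answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class CollapseRepeats {
    // Capture one non-letter, non-digit character, then match its repeats via \1.
    private static final Pattern BACKREF =
            Pattern.compile("([^\\p{L}\\p{N}])\\1+");

    // Match a non-letter, non-digit character only when the same character follows it.
    private static final Pattern LOOKAHEAD =
            Pattern.compile("([^\\p{L}\\p{N}])(?=\\1+)");

    // Replaces each run of a repeated symbol with a single copy (group 1).
    static String collapseWithBackreference(String input) {
        return BACKREF.matcher(input).replaceAll("$1");
    }

    // Deletes every symbol that is followed by the same symbol, so only the last copy stays.
    static String collapseWithLookahead(String input) {
        return LOOKAHEAD.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String test = "Hey this is a string with lots of symbols!!!!!@@@@@#####";
        System.out.println(collapseWithBackreference(test));
        System.out.println(collapseWithLookahead(test));
    }
}
```

Note that spaces are also non-letter, non-digit characters, so both variants collapse runs of whitespace as well. If that is not wanted, a whitelist-based character class is probably the safer choice.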

Answer 2

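Since you have already defined a whitelist, I would recommend this approach: match every run of a repeated allowed symbol and keep only the first character.

([!"'><+_^@#=/\-])\1+

In Java: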

String test = "Hey this is a string with lots of symbols!!!!!@@@@@#####";

test = test.replaceAll("([!\"'><+_^@#=/\\-])\\1+", "$1");

Result:

"Hey this is a string with lots of symbols!@#"
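If the whitelist changes often, the character class can be generated instead of typed by hand. The following is only a sketch under stated assumptions: `WhitelistCollapse`, `ALLOWED`, and `buildPattern` are hypothetical names, and the symbol list is adapted from the question's `allowed` array (including the backslash) rather than taken from this answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names and the symbol list are assumptions.
class WhitelistCollapse {
    // Symbols adapted from the question's "allowed" array.
    private static final String ALLOWED = "!\"'-><+_^@#=/\\";

    private static final Pattern PATTERN = buildPattern(ALLOWED);

    // Builds ([...])\1+ with every whitelisted symbol backslash-escaped.
    // java.util.regex permits a backslash before any non-alphabetic character,
    // so this is safe as long as the whitelist contains no letters.
    static Pattern buildPattern(String allowed) {
        StringBuilder regex = new StringBuilder("([");
        for (char c : allowed.toCharArray()) {
            regex.append('\\').append(c);
        }
        regex.append("])\\1+");
        return Pattern.compile(regex.toString());
    }

    // Collapses each run of a repeated whitelisted symbol to a single copy.
    static String collapse(String input) {
        return PATTERN.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        String test = "Hey this is a string with lots of symbols!!!!!@@@@@#####";
        System.out.println(collapse(test));
    }
}
```

Unlike the catch-all `[^\p{L}\p{N}]` class, this version leaves repeated spaces and symbols outside the whitelist (for example `$$$`) untouched.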
