Removing spaces, numbers and special characters from a string

I am writing a function that removes spaces from a String passed as an argument.

This code works:

public static String removeSpecialChars(String str) {
    String finalstr = "";
    char[] arr = str.toCharArray();
    char ch;
    for (int i = 0; i < arr.length; i++) {
        ch = arr[i];
        if (Character.isLetter(ch))
            finalstr = finalstr.concat(String.valueOf(ch));
        else
            continue;
    }
    return finalstr;
}

For the string 'hello world!' the output is:

helloworld

But this one doesn't:

public static String removeSpecialChars(String str) {
    char[] arr = str.toCharArray();
    char[] arr2 = new char[str.length()];
    char ch;
    for (int i = 0; i < arr.length; i++) {
        ch = arr[i];
        if (Character.isLetter(ch))
            arr2[i] = ch;
    }
    return String.valueOf(arr2);
}

Output:

hello world

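I get the same string back as output, with only the exclamation mark removed. What could be the reason for this? Any help would be appreciated.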

A char value is just a numeric value in the range 0 to 2¹⁶−1. In hexadecimal (base 16), we write that range as 0000 to ffff.

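So, knowing that each char array is a sequence of numeric values, let's look at the state of each array as the program proceeds. (For brevity, I show each value as two hex digits rather than four, since they are all in the range 00–ff.)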

char [] arr = str.toCharArray();

// [ 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ]
// (UTF-16 values for the characters in "hello world!")

char [] arr2 = new char[str.length()];

// [ 00 00 00 00 00 00 00 00 00 00 00 00 ]
// (newly allocated arrays are always filled with zeroes)

char ch;
for (int i = 0; i < arr.length; i++) {
    ch = arr[i];
    if (Character.isLetter(ch))
        arr2[i] = ch;
}

// arr2 after first loop iteration:
// [ 68 00 00 00 00 00 00 00 00 00 00 00 ]

// arr2 after second loop iteration:
// [ 68 65 00 00 00 00 00 00 00 00 00 00 ]

// arr2 after third loop iteration:
// [ 68 65 6c 00 00 00 00 00 00 00 00 00 ]

// arr2 after fourth loop iteration:
// [ 68 65 6c 6c 00 00 00 00 00 00 00 00 ]

// arr2 after fifth loop iteration:
// [ 68 65 6c 6c 6f 00 00 00 00 00 00 00 ]

// During the sixth loop iteration (i = 5),
// the if-condition is not met, so arr2[5]
// is never changed at all!
// [ 68 65 6c 6c 6f 00 00 00 00 00 00 00 ]

// arr2 after seventh loop iteration:
// [ 68 65 6c 6c 6f 00 77 00 00 00 00 00 ]

// During the twelfth and final loop iteration (i = 11),
// the if-condition is not met, so arr2[11]
// is never changed at all!
// [ 68 65 6c 6c 6f 00 77 6f 72 6c 64 00 ]

I don't know how you are examining the returned string, but this is what is actually in it:

"hello\u0000world\u0000"
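Many consoles render the NUL character as a blank or as nothing at all, which would explain why the output looked like the original string. A minimal sketch of one way to make such characters visible is to escape everything outside printable ASCII. The class and method names here (`CharDump`, `escape`) are made up for illustration:

```java
class CharDump {
    // Returns a copy of s in which every char outside printable ASCII
    // is replaced by a backslash, the letter u and four hex digits.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);                                  // printable: keep as-is
            } else {
                sb.append(String.format("\\u%04x", (int) c));  // otherwise: hex escape
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Rebuild the array the second version ends up with:
        // letters copied to their original positions, zeroes elsewhere.
        char[] arr2 = new char[12];
        "hello".getChars(0, 5, arr2, 0);
        "world".getChars(0, 5, arr2, 6);
        System.out.println(escape(String.valueOf(arr2)));
    }
}
```

Run on the array from the question, this should print the escaped form shown above, with both zero-valued slots visible.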

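As Johnny Mopp pointed out, since you want to skip some characters, you need two index variables: one for reading from the source array and one for writing to the destination array. When you create the String at the end, use the second index variable to limit the number of characters that go into it.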
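A sketch of that two-index fix, keeping the question's variable names and adding a write index `j` (the name `j` and the wrapper class `TwoIndexDemo` are my own choices, not from the question):

```java
class TwoIndexDemo {
    // Keeps only letters. Index i reads from the input array;
    // index j marks the next free slot in the output array.
    static String removeSpecialChars(String str) {
        char[] arr = str.toCharArray();
        char[] arr2 = new char[arr.length];
        int j = 0;
        for (int i = 0; i < arr.length; i++) {
            char ch = arr[i];
            if (Character.isLetter(ch)) {
                arr2[j] = ch;
                j++; // advance only when a char was actually stored
            }
        }
        // Use only the first j chars; the rest of arr2 is still zeroes.
        return String.valueOf(arr2, 0, j);
    }

    public static void main(String[] args) {
        System.out.println(removeSpecialChars("hello world!")); // helloworld
    }
}
```

Because `j` only moves when a letter is stored, the kept characters end up packed at the front of `arr2`, and `String.valueOf(arr2, 0, j)` leaves the unused zero-valued tail out of the result.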

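Since Java 9, String has its own codePoints method (on Java 8 the same call resolves to the CharSequence default method), so you can use a stream: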

public static void main(String[] args) {
    System.out.println(removeSpecialChars("hello world!")); // helloworld
    System.out.println(removeSpecialChars("^&*abc123_+"));  // abc
    System.out.println(removeSpecialChars("STRING"));       // STRING
    System.out.println(removeSpecialChars("Слово_Йй+ёЁ"));  // СловоЙйёЁ
}

public static String removeSpecialChars(String str) {
    return str.codePoints()
            // IntStream of code points: drop the non-alphabetic ones
            .filter(Character::isAlphabetic)
            // append each remaining code point to a StringBuilder;
            // there is no cast to char, so characters outside the
            // Basic Multilingual Plane are not truncated
            .collect(StringBuilder::new,
                    StringBuilder::appendCodePoint,
                    StringBuilder::append)
            .toString();
}
