Removing spaces, numbers and special characters from a string
I'm writing a function that removes spaces from a string passed as an argument.
This code works:
public static String removeSpecialChars(String str) {
    String finalstr = "";
    char[] arr = str.toCharArray();
    char ch;
    for (int i = 0; i < arr.length; i++) {
        ch = arr[i];
        if (Character.isLetter(ch))
            finalstr = finalstr.concat(String.valueOf(ch));
        else
            continue;
    }
    return finalstr;
}
For the string 'hello world!' the output is:
helloworld
But this one doesn't:
public static String removeSpecialChars(String str) {
    char[] arr = str.toCharArray();
    char[] arr2 = new char[str.length()];
    char ch;
    for (int i = 0; i < arr.length; i++) {
        ch = arr[i];
        if (Character.isLetter(ch))
            arr2[i] = ch;
    }
    return String.valueOf(arr2);
}
Output:
hello world
I get the same string back as output, with only the exclamation mark removed. What could be the reason? Any help would be appreciated.
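A char value is just a numeric value in the range 0 to 2¹⁶−1. In hexadecimal (base 16), we write that as 0000 to ffff.
So, knowing that every char array is a sequence of numeric values, let's look at the state of each array as the program runs. (For brevity, I show each value as two hex digits rather than four, since they all fall in the range 00–ff.)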
char[] arr = str.toCharArray();
// [ 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ]
// (UTF-16 values for the characters in "hello world!")
char[] arr2 = new char[str.length()];
// [ 00 00 00 00 00 00 00 00 00 00 00 00 ]
// (newly allocated arrays are always filled with zeroes)
char ch;
for (int i = 0; i < arr.length; i++) {
    ch = arr[i];
    if (Character.isLetter(ch))
        arr2[i] = ch;
}
// arr2 after first loop iteration:
// [ 68 00 00 00 00 00 00 00 00 00 00 00 ]
// arr2 after second loop iteration:
// [ 68 65 00 00 00 00 00 00 00 00 00 00 ]
// arr2 after third loop iteration:
// [ 68 65 6c 00 00 00 00 00 00 00 00 00 ]
// arr2 after fourth loop iteration:
// [ 68 65 6c 6c 00 00 00 00 00 00 00 00 ]
// arr2 after fifth loop iteration:
// [ 68 65 6c 6c 6f 00 00 00 00 00 00 00 ]
// During sixth loop iteration (i = 5),
// the if-condition is not met, so arr2[5]
// is never changed at all!
// [ 68 65 6c 6c 6f 00 00 00 00 00 00 00 ]
// arr2 after seventh loop iteration:
// [ 68 65 6c 6c 6f 00 77 00 00 00 00 00 ]
// During twelfth and final loop iteration (i = 11),
// the if-condition is not met, so arr2[11]
// is never changed at all!
// [ 68 65 6c 6c 6f 00 77 6f 72 6c 64 00 ]
I don't know how you are examining the returned string, but this is what is actually in it:
"hello\u0000world\u0000"
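One way to confirm this is to print the string's length and the numeric value of each char instead of the string itself, since many consoles render U+0000 as a blank or as nothing at all. The following is only a sketch; the class name StringInspector and the helper toHex are made up for illustration.

```java
class StringInspector {
    // Hypothetical helper: renders each char of s as a four-digit hex value,
    // separated by single spaces.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The string the second version of removeSpecialChars returns
        String result = "hello\u0000world\u0000";
        System.out.println(result.length()); // 12, not the 10 of "helloworld"
        System.out.println(toHex(result));   // the 0000 entries are the untouched slots
    }
}
```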
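As Johnny Mopp pointed out, because you want to skip some characters, you need two index variables, and when you create the String at the end, you need to use the second index variable to limit the number of characters that go into it.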
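A minimal sketch of that two-index approach might look like the following. The wrapper class LetterFilter and the variable name count are my own choices, not something from the question; the rest follows the asker's second version as closely as possible.

```java
class LetterFilter {
    static String removeSpecialChars(String str) {
        char[] arr = str.toCharArray();
        char[] arr2 = new char[arr.length];
        int count = 0; // second index: next free slot in arr2
        for (int i = 0; i < arr.length; i++) {
            char ch = arr[i];
            if (Character.isLetter(ch)) {
                // Write to the next free slot rather than to position i,
                // so skipped characters leave no gaps behind.
                arr2[count++] = ch;
            }
        }
        // Build the String from the first `count` chars only,
        // leaving out the unused zero-filled tail of arr2.
        return String.valueOf(arr2, 0, count);
    }

    public static void main(String[] args) {
        System.out.println(removeSpecialChars("hello world!")); // helloworld
    }
}
```

Compared with the concat version, this allocates one array up front instead of creating a new String for every letter kept.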
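Since Java 9, String provides its own codePoints method (on Java 8 the same call resolves to the CharSequence default method), so you can use it: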
import java.util.stream.Collectors;

public static void main(String[] args) {
    System.out.println(removeSpecialChars("hello world!")); // helloworld
    System.out.println(removeSpecialChars("^&*abc123_+")); // abc
    System.out.println(removeSpecialChars("STRING")); // STRING
    System.out.println(removeSpecialChars("Слово_Йй+ёЁ")); // СловоЙйёЁ
}
public static String removeSpecialChars(String str) {
    return str.codePoints()
            // filter out non-alphabetic code points while still on the IntStream,
            // so characters outside the BMP are not truncated by a cast to char
            .filter(Character::isAlphabetic)
            // Stream<String>, one string per code point
            .mapToObj(cp -> new String(Character.toChars(cp)))
            // concatenate into a single string
            .collect(Collectors.joining());
}