Split a space-separated String in Java
I need to split a string into words on spaces in Java, so I used the .split function, as shown below:
String keyword = "apple mango ";
String[] keywords = keyword.split(" ");
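The code above works fine, but the only problem is that my input sometimes contains multi-word keywords wrapped in double quotes, such as "jack fruit" and "ice cream", as shown below: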
String keyword = "apple mango \"jack fruit\" \"ice cream\"";
In this case I need to get 4 entries in the keywords array: apple, mango, jack fruit, and ice cream.
Can anyone suggest a solution?
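To make the problem concrete, here is a small sketch of what a plain split on a single space returns for the quoted input. The class name `NaiveSplitDemo` and the helper `naive` are made up for illustration only.

```java
import java.util.Arrays;

class NaiveSplitDemo {

    // Hypothetical helper: the plain split from the question, unchanged.
    static String[] naive(String s) {
        return s.split(" ");
    }

    public static void main(String[] args) {
        String keyword = "apple mango \"jack fruit\" \"ice cream\"";
        // Each quoted phrase is cut in two, and the quote characters stay attached.
        System.out.println(Arrays.toString(naive(keyword)));
        // prints [apple, mango, "jack, fruit", "ice, cream"]
    }
}
```

The array has six entries instead of the four that are wanted, which is why the split has to be aware of the quotes.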
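This solution works, but I'm sure it is not the best choice for performance/resources. It also works when a fruit name has more than two words. Feel free to edit or optimize my code.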
// Requires: import java.util.ArrayList; import java.util.List;
public static void main(String[] args) {
    String keyword = "apple mango \"jack fruit\" \"ice cream\" \"one two three\"";
    String[] split = custom_split(keyword);
    for (String s : split) {
        System.out.println(s);
    }
}

private static String[] custom_split(String keyword) {
    String[] split = keyword.split(" ");
    List<String> list = new ArrayList<>();
    StringBuilder temp = new StringBuilder();
    boolean multiple = false; // true while we are inside a quoted phrase
    for (String s : split) {
        if (!multiple && s.startsWith("\"")) {
            // a quoted single word such as "kiwi" opens and closes in one piece
            if (s.length() > 1 && s.endsWith("\"")) {
                list.add(s.replace("\"", ""));
                continue;
            }
            multiple = true;
            temp.append(s.replace("\"", ""));
            continue;
        }
        if (multiple && s.endsWith("\"")) {
            // closing piece: finish the phrase and start a fresh buffer
            multiple = false;
            temp.append(" ").append(s.replace("\"", ""));
            list.add(temp.toString());
            temp = new StringBuilder();
            continue;
        }
        if (multiple) {
            temp.append(" ").append(s);
        } else {
            list.add(s);
        }
    }
    return list.toArray(new String[0]);
}
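Since the answer above invites optimizations, here is one possible variant: a single pass over the characters that toggles an "inside quotes" flag, so no intermediate array of pieces is built. This is only a sketch. The class name `QuoteAwareSplitter` and the method `split` are made up for illustration. It assumes quotes are balanced and never escaped, and it drops empty tokens (including an empty `""`).

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {

    // Splits on spaces, except inside double quotes; the quotes themselves are dropped.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;           // toggle the state, do not copy the quote
            } else if (c == ' ' && !inQuotes) {
                if (current.length() > 0) {     // end of a token; runs of spaces are skipped
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);              // ordinary character, or a space inside quotes
            }
        }
        if (current.length() > 0) {             // flush the last token
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String keyword = "apple mango \"jack fruit\" \"ice cream\" \"one two three\"";
        for (String s : split(keyword)) {
            System.out.println(s);
        }
    }
}
```

With an unbalanced quote, everything after the opening quote ends up in one token, which may or may not be what you want.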
You can't do this with a simple String.split(). You need to come up with a regular expression for the target tokens and collect them via a matcher, like this:
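// Requires: import java.util.*; import java.util.regex.*;
// Note the doubled backslash: the regex \s must be written "\\s" in a Java string literal.
final Pattern token = Pattern.compile("[^\"\\s]+|\"[^\"]*\"");
List<String> tokens = new ArrayList<>();
Matcher m = token.matcher("apple mango \"jack fruit\" \"ice cream\"");
while (m.find()) {
    tokens.add(m.group()); // quoted tokens keep their surrounding quotes
}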
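The tokens collected by the matcher above still carry their double quotes. If you want them removed, one option is sketched below using `Matcher.results()`, which needs Java 9 or later. The class name `TokenStream` and the method `tokens` are made up for illustration, and the sketch assumes quotes are never escaped inside a phrase.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TokenStream {

    // Either a run of non-quote, non-whitespace characters, or a quoted span.
    private static final Pattern TOKEN = Pattern.compile("[^\"\\s]+|\"[^\"]*\"");

    static List<String> tokens(String input) {
        return TOKEN.matcher(input)
                .results()                       // Java 9+: a stream of MatchResult
                .map(MatchResult::group)         // the full text of each match
                .map(t -> t.replace("\"", ""))   // strip the surrounding quotes
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        tokens("apple mango \"jack fruit\" \"ice cream\"").forEach(System.out::println);
    }
}
```

Compiling the pattern once in a static field avoids recompiling it on every call.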
I would do it with one regular expression and two capture groups, one for each pattern. I don't know of any other way.
String keyword = "apple mango \"jack fruit\" \"ice cream\"";
// group 1: the text inside a quoted phrase; group 2: a bare word
Pattern p = Pattern.compile("\"([^\"]*)\"|(\\S+)");
Matcher m = p.matcher(keyword);
while (m.find()) {
    String word = m.group(1) == null ? m.group(2) : m.group(1);
    System.out.println(word);
}
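This splits the string on the quotes, then additionally splits the even-indexed members (the parts outside the quotes) on spaces.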
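String keyword = "apple mango \"jack fruit\" \"ice cream\"";
String[] splitQuotes = keyword.split("\"");
List<String> keywords = new ArrayList<>();
for (int i = 0; i < splitQuotes.length; i++) {
    if (i % 2 == 0) {
        // outside quotes: split on whitespace, skipping blank segments
        String outside = splitQuotes[i].trim();
        if (!outside.isEmpty()) {
            Collections.addAll(keywords, outside.split("\\s+"));
        }
    } else {
        // inside quotes: keep the phrase whole
        keywords.add(splitQuotes[i]);
    }
}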
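In the same split-based spirit as the approach above, a single `String.split` call can also work if the delimiter regex uses a lookahead: split only on spaces that are followed by an even number of quotes. This is a sketch under the assumption that quotes are balanced and never escaped. The class name `LookaheadSplit` and its members are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LookaheadSplit {

    // One or more spaces followed by an even number of quotes up to the end of the input,
    // i.e. spaces that are not inside a quoted phrase.
    private static final String OUTSIDE_QUOTES = " +(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static List<String> split(String input) {
        return Arrays.stream(input.trim().split(OUTSIDE_QUOTES))
                .filter(s -> !s.isEmpty())        // an empty input yields one empty piece
                .map(s -> s.replace("\"", ""))    // drop the quotes around phrases
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        split("apple mango \"jack fruit\" \"ice cream\"").forEach(System.out::println);
    }
}
```

The lookahead rescans the rest of the string at every candidate space, so this is likely to be slower than the other approaches on long inputs.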
List<String> parts = new ArrayList<>();
String keyword = "apple mango \"jack fruit\" \"ice cream\"";
// first use a matcher to grab the quoted terms
Pattern p = Pattern.compile("\"(.*?)\"");
Matcher m = p.matcher(keyword);
while (m.find()) {
    parts.add(m.group(1));
}
// then remove all quoted terms (quotes included)
keyword = keyword.replaceAll("\".*?\"", "").trim();
// finally split the remaining keywords on whitespace
if (!keyword.isEmpty()) {
    Collections.addAll(parts, keyword.split("\\s+"));
}
// note: quoted terms are collected first, so the original order is not preserved
for (String part : parts) {
    System.out.println(part);
}
Output:
jack fruit
ice cream
apple
mango