How to get a specific extract from a String?
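I want to extract a snippet from a String. The excerpt should contain the 2 words before the keyword and the 2 words after it. If those 2 words don't exist, the excerpt should simply end there.
Example:
The word I am looking for is "example".
Existing Strings: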
String text1 = "This is an example.";
String text2 = "This is another example, but this time the sentence is longer";
Result:
text1 should look like this:
is an example.
text2 should look like this:
is another example, but this
How can I do this?
Try using a Pattern:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String text1 = "This is an example.";
        String text2 = "This is another example, but this time the sentence is longer";
        String key = "example";
        // Optionally the two preceding words, then the keyword,
        // then optionally a comma followed by up to two more words.
        // Backslashes must be doubled inside a Java string literal.
        String regex = "((\\w+\\s){2})?" + key + "([,](\\s\\w+){0,2})?";
        Pattern pattern = Pattern.compile(regex);

        Matcher matcher = pattern.matcher(text1);
        if (matcher.find()) { // group(0) is only valid after a successful find()
            System.out.println(matcher.group(0));
        }

        matcher = pattern.matcher(text2);
        if (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}
Output:
is an example
is another example, but this
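You may need to tweak the regular expression a little (for example, it does not keep the trailing period of text1), but you can try this.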
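One possible tweak, as a rough sketch only: wrap the pattern in a helper that takes the keyword and the number of context words, and allows one trailing punctuation mark so that text1 keeps its period. The class name `ExcerptSketch`, the method `excerpt`, and the parameter `n` are made up for illustration; `Pattern.quote` is used on the assumption that the keyword should be matched literally.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExcerptSketch {
    // Hypothetical helper: returns the keyword with up to n words on each side,
    // or an empty Optional when the keyword does not occur in the text.
    static Optional<String> excerpt(String text, String key, int n) {
        String regex = "(?:\\w+\\W+){0," + n + "}"   // up to n words before the keyword
                + Pattern.quote(key)                  // the keyword, taken literally
                + "(?:\\W+\\w+){0," + n + "}"        // up to n words after the keyword
                + "\\p{Punct}?";                      // one optional trailing punctuation mark
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String text1 = "This is an example.";
        String text2 = "This is another example, but this time the sentence is longer";
        System.out.println(excerpt(text1, "example", 2).orElse("(no match)"));
        System.out.println(excerpt(text2, "example", 2).orElse("(no match)"));
    }
}
```

With the two sample strings this should print `is an example.` and `is another example, but this`. Note that the keyword is matched as a plain substring here, so "examples" would also match; adding word boundaries is left out of the sketch.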
With replaceAll(), it can be done in one line:
String target = text1.replaceAll(".*?((\\w+\\W+){2})(example)((\\W+\\w+){2})?.*", "$1$3$4");
FYI, \w means "word character" and \W means "non-word character".
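To make the character classes and the one-liner more concrete, here is a small sketch. It assumes the replacement string keeps groups 1, 3 and 4 (the two leading words, the keyword, and the optional two trailing words); the class name `ReplaceAllSketch` and the method `excerpt` are invented for the example.

```java
class ReplaceAllSketch {
    // Replaces the whole input with only the captured excerpt.
    // Group 1: two words before the keyword, group 3: the keyword,
    // group 4: two words after it (absent groups contribute nothing).
    static String excerpt(String text) {
        return text.replaceAll(
                ".*?((\\w+\\W+){2})(example)((\\W+\\w+){2})?.*",
                "$1$3$4");
    }

    public static void main(String[] args) {
        // \w covers letters, digits and underscore; \W is everything else.
        System.out.println("a".matches("\\w"));   // true
        System.out.println("_".matches("\\w"));   // true
        System.out.println(",".matches("\\W"));   // true
        System.out.println(" ".matches("\\W"));   // true

        System.out.println(excerpt("This is an example."));
        System.out.println(excerpt(
                "This is another example, but this time the sentence is longer"));
    }
}
```

The last two lines should print `is an example` and `is another example, but this`. Two caveats with this approach: the trailing words are all-or-nothing (either two or none), and when the pattern does not match at all, for instance when fewer than two words precede the keyword, replaceAll() returns the input unchanged.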