Substring manipulation in Java - Find the longest word made from the other words
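I need to read the contents of a file and find the longest word that can be formed from the other words present in the file. The words in the file are space-separated. For example:
Input from file: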
This is example an anexample Thisisanexample Thisistheexample
Output:
Thisisanexample
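Note: The longest word formed is Thisisanexample and not Thisistheexample, because the word "the" does not appear as a separate word in the file.
Is this possible using simple arrays? I have done the following: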
try{
File file = new File(args[0]); //command line argument for file path
br = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
String line = null;
//array for each word
String[] words = new String[] {};
while ((line = br.readLine()) != null){
words = line.split("\\s+"); //splitting the string with spaces
}
// array to store length of each word
int[] wordLength = new int[words.length];
for(int i = 0; i < words.length; i++){
wordLength[i] = words[i].length();
}
int currLength = 0; //store length of current word
int maxLength = 0; //store length of max word
String maxWord = null;
//checking each word with others at O(n*n) complexity
for (int i = 0; i < words.length; i++){
currLength = 0;
for (int j = 0; j < words.length && j != i; j++){
if (words[i].contains(words[j])){
currLength += wordLength[j];
}
}
System.out.println(currLength);
if(currLength > maxLength){
maxLength = currLength;
maxWord = words[i];
}
}
System.out.println(maxWord);
}
But this does not work when one substring is itself contained in another substring. It gives the wrong output for the following input:
This is example an anexample Thisisanexample Thisisanexample2
The output should be Thisisanexample, but it gives Thisisanexample2.
With the help of other Stack Overflow threads, I managed to do this using only arrays.
Here is the solution:
import java.io.*;
import java.util.*;
public class LongestWord implements Comparator<String>{
//compare function to be used for sorting the array according to word length
public int compare(String s1, String s2) {
if (s1.length() < s2.length())
return 1;
else if (s1.length() > s2.length())
return -1;
else
return 0;
}
public static void main(String[] args){
BufferedReader br = null;
try{
File file = new File(args[0]);
br = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
String line = null;
//array for each word
String[] words = new String[] {};
while ((line = br.readLine()) != null){
words = line.split("\\s+"); //splitting the string with spaces
}
//sort the array according to length of words in descending order
Arrays.sort(words, new LongestWord());
/* start with the longest word in the array and check if the other words are its substring.
* If substring, then remove that part from the superstring.
* Finally, if the superstring length is 0, then it is the longest word that can be formed.*/
for (String superString: words) {
String current = new String(superString); // to store a copy of the current superstring as we will remove parts of the actual superstring
for (String subString: words) {
if (!subString.equals(current) && superString.contains(subString)) { // superstring contains substring
superString = superString.replace(subString, ""); // remove the substring part from the superstring
}
}
if (superString.length() == 0){
System.out.println(current);
break; // since the array is sorted, the first word that returns length 0 is the longest word formed
}
}
}
catch(FileNotFoundException fex){
System.out.println("File not found");
return;
}
catch(IOException e){
e.printStackTrace();
}
finally{
try {
if (br != null){
br.close();
}
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
}
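One caveat with the replace-based check above: its result depends on the order in which the shorter words are removed, and removing a piece from the middle of a word joins its two neighbours, so some inputs may be misjudged. For example, with the words thisisan, isis, this, is and an, removing isis first leaves "than", and thisisan is rejected even though it is this + is + an. A minimal sketch of an alternative, assuming a standard word-break dynamic-programming check is acceptable here; the class name `ComposedWordFinder` and the methods `isComposed` and `longestComposed` are hypothetical and not part of the original post:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, not from the original post.
public class ComposedWordFinder {

    // True if word can be split into two or more pieces that all occur in dict.
    static boolean isComposed(String word, Set<String> dict) {
        int n = word.length();
        if (n == 0) {
            return false; // an empty token is never a composed word
        }
        // reachable[k] is true when word[0..k) is a concatenation of dictionary words
        boolean[] reachable = new boolean[n + 1];
        reachable[0] = true;
        for (int end = 1; end <= n; end++) {
            for (int start = 0; start < end; start++) {
                if (start == 0 && end == n) {
                    continue; // skip the trivial "split" that uses the whole word as one piece
                }
                if (reachable[start] && dict.contains(word.substring(start, end))) {
                    reachable[end] = true;
                    break;
                }
            }
        }
        return reachable[n];
    }

    // Returns the longest word built from other words, or null if there is none.
    static String longestComposed(String[] words) {
        Set<String> dict = new HashSet<>(Arrays.asList(words));
        String best = null;
        for (String w : words) {
            if ((best == null || w.length() > best.length()) && isComposed(w, dict)) {
                best = w;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] words =
            "This is example an anexample Thisisanexample Thisistheexample".split("\\s+");
        System.out.println(longestComposed(words)); // prints Thisisanexample
    }
}
```

This sketch uses a HashSet for lookups rather than plain arrays, which is a departure from the question's constraint; the same table-filling idea would also work with a linear search over the array.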
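In just a few lines of code, you can use a regular expression to find the candidate "combination" words, and then use simple logic to find the longest match: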
String longest = "";
Matcher m = Pattern.compile("(?i)\\b(this|is|an|example)+\\b").matcher(input);
while (m.find())
if ( m.group().length() > longest.length())
longest = m.group();
Apart from the code that reads the file and assigns its contents to the variable input, that is all the code you need.
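The pattern in this answer hard-codes the four component words. A minimal sketch of how the alternation might instead be built from the words actually read, assuming the words consist of word characters so that \b marks their boundaries; the class name `RegexLongestWord` and the method `longestCombination` are hypothetical. It differs from the snippet above in two ways that are my assumptions, not the answer's: it requires at least two pieces ({2,}) so that a word does not match as a combination of itself alone, and it drops (?i), so matching is case-sensitive.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not from the original answer.
public class RegexLongestWord {

    // Builds \b(?:w1|w2|...){2,}\b from the words in input and returns
    // the longest match, or "" if no word is a combination of others.
    static String longestCombination(String input) {
        StringJoiner alternation = new StringJoiner("|");
        for (String w : new LinkedHashSet<>(Arrays.asList(input.trim().split("\\s+")))) {
            if (!w.isEmpty()) {
                alternation.add(Pattern.quote(w)); // escape any regex metacharacters
            }
        }
        if (alternation.length() == 0) {
            return ""; // no words at all
        }
        Pattern p = Pattern.compile("\\b(?:" + alternation + "){2,}\\b");
        String longest = "";
        Matcher m = p.matcher(input);
        while (m.find()) {
            if (m.group().length() > longest.length()) {
                longest = m.group();
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        System.out.println(longestCombination(
            "This is example an anexample Thisisanexample Thisistheexample"));
        // prints Thisisanexample
    }
}
```

For files with many words the alternation grows large and backtracking may become slow, so this is better suited to small inputs.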