Counting the number of times the articles "a" and "an" are used in a text file
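I am trying to write a program that counts the number of words, lines, and sentences in a text file, as well as the number of times the articles 'a', 'and', 'the' appear.
So far I have the words, lines, and sentences, but I don't know how to count the articles. How can the program tell the difference between 'a' and 'and'?
This is my code so far.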
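import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Scanner;
import java.util.StringTokenizer;

// The class header was missing from the post; "TextStats" is a placeholder name.
public class TextStats {
    public static void main(String[] args) throws IOException {
        // Backslashes must be doubled inside a Java string literal.
        String path = "C:\\Users\\nlstudent\\Downloads\\text.txt";
        int ch, sentence = 0, words = 0, chars = 0, lines = 0;

        // First pass: count sentence terminators one byte at a time.
        try (FileInputStream file = new FileInputStream(path)) {
            while ((ch = file.read()) != -1) {
                if (ch == '?' || ch == '!' || ch == '.')
                    sentence++;
            }
        }

        // Second pass: count lines, characters, and words.
        try (Scanner sfile = new Scanner(new File(path))) {
            while (sfile.hasNextLine()) {
                lines++;
                String line = sfile.nextLine();
                chars += line.length();
                words += new StringTokenizer(line, " ,").countTokens();
            }
        }

        System.out.println("Number of words: " + words);
        System.out.println("Number of sentence: " + sentence);
        System.out.println("Number of lines: " + lines);
        System.out.println("Number of characters: " + chars);
    }
}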
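The tokenizer splits each line into tokens. You can check each token (a whole word) to see whether it matches the string you are looking for. Here is an example that counts a, and, and the (it also counts for):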
int a = 0, and = 0, the = 0, forCount = 0;
while (sfile.hasNextLine()) {
    lines++;
    String line = sfile.nextLine();
    chars += line.length();
    StringTokenizer tokenizer = new StringTokenizer(line, " ,");
    words += tokenizer.countTokens();
    // Compare each whole token, so "and" is never mistaken for "a".
    // Note: equals is case-sensitive, so "A" and "The" are not counted here.
    while (tokenizer.hasMoreTokens()) {
        String element = tokenizer.nextToken();
        if ("a".equals(element)) {
            a++;
        } else if ("and".equals(element)) {
            and++;
        } else if ("for".equals(element)) {
            forCount++;
        } else if ("the".equals(element)) {
            the++;
        }
    }
}
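To try the token-comparison idea without a file, here is a minimal sketch that runs the same kind of loop over an in-memory string. The class name `TokenArticleCount`, the helper `countExact`, and the sample sentence are made up for illustration. It also makes two assumptions beyond the code above: the delimiter set is widened to include some punctuation, and the comparison ignores case.

```java
import java.util.StringTokenizer;

final class TokenArticleCount {

    // Hypothetical helper: counts the tokens in `line` that equal `target`, ignoring case.
    static int countExact(String line, String target) {
        // Assumption: besides space and comma, treat . ! ? ; : as delimiters,
        // so a sentence-final "the." still yields the token "the".
        StringTokenizer tokenizer = new StringTokenizer(line, " ,.!?;:");
        int count = 0;
        while (tokenizer.hasMoreTokens()) {
            // Whole-token comparison: "and" or "Andy" never matches "a".
            if (target.equalsIgnoreCase(tokenizer.nextToken())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "A cat and a dog sat on the mat, and the cat left.";
        System.out.println("a: " + countExact(line, "a"));
        System.out.println("and: " + countExact(line, "and"));
        System.out.println("the: " + countExact(line, "the"));
    }
}
```

Because whole tokens are compared rather than substrings, there is nothing special to do to separate 'a' from 'and'; they are simply different tokens.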
How can a program tell the difference between 'a' and 'and'?
You can use a regular expression for this:
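String input = "A and Andy then the are a";
// In Java source the backslash must be doubled: "\\b" is the regex \b (word boundary),
// whereas "\b" would be a backspace character.
Matcher m = Pattern.compile("(?i)\\b((a)|(an)|(and)|(the))\\b").matcher(input);
int count = 0;
while (m.find()) {
    count++;
}
// count == 4 ("A", "and", "the", "a"; "Andy", "then" and "are" do not match)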
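'\b' is a word boundary, '|' is OR, and '(?i)' is the case-insensitive flag. You can find the list of all pattern constructs here, and it is probably worth reading up on regular expressions.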
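The snippet above gives one combined total. If you want a separate count per word, one possible variation is to look at what each match captured. This is only a sketch: the class name `RegexArticleCount` and the helper `countEach` are invented for illustration, and it uses `Pattern.CASE_INSENSITIVE` in place of the inline `(?i)` flag.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexArticleCount {

    // Same word list as the pattern above, wrapped in word boundaries.
    private static final Pattern ARTICLES =
            Pattern.compile("\\b(a|an|and|the)\\b", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: maps each matched word (lower-cased) to its number of occurrences.
    static Map<String, Integer> countEach(String input) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = ARTICLES.matcher(input);
        while (m.find()) {
            // group(1) is the word that matched; normalise case so "A" and "a" share a key.
            String word = m.group(1).toLowerCase(Locale.ROOT);
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countEach("A and Andy then the are a"));
    }
}
```

Words that never occur are simply absent from the map, so callers may prefer `getOrDefault(word, 0)` when reading a count.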