Extracting lines with only 1 word?

Question

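I'm trying to get only the lines that contain exactly 1 word.

My current approach gives the correct result, but sometimes the input file has more than 4 lines between one word and the next. So I need a way to get only the lines that contain a single word. Any ideas?

Here is a sample of the input text: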

adversary
someone who offers opposition
The students are united by shared suffering, and by a common adversary. 
— New York Times (Nov 10, 2014)
aplomb
great coolness and composure under strain
I wish I had handled it with aplomb. 
— New York Times (May 18, 2014)
apprehensive

So the output should look like this:

adversary
aplomb
apprehensive

The current code is as follows:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Process {

    public static void main(String[] args) {

        String fileNameOutput = "OutputFile.txt";
        String fileName = "InputWords";

        try (BufferedReader bReader = Files.newBufferedReader(Paths.get(fileName))){

            PrintWriter outputStream = new PrintWriter(fileNameOutput); 
            int lineNum = 0;
            String line = null;

            while ( (line = bReader.readLine() ) != null ) {
                lineNum++;

                if ( lineNum % 4 == 0 ) continue;

                outputStream.println(line);
            }
            outputStream.close();

        } catch (IOException e) {
            e.printStackTrace();
        }



    }

}

Thank you for your time.

Edit

After applying the fix suggested below, I get this error on the console:

java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(Unknown Source)
    at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
    at sun.nio.cs.StreamDecoder.read(Unknown Source)
    at java.io.InputStreamReader.read(Unknown Source)
    at java.io.BufferedReader.fill(Unknown Source)
    at java.io.BufferedReader.readLine(Unknown Source)
    at java.io.BufferedReader.readLine(Unknown Source)
    at Process.main(Process.java:20)

Answer 1

Well, instead of the

if ( lineNum % 4 == 0 ) continue;

condition, you can simply check whether the line you just read contains more than one token:

if (line.split(" ").length > 1) continue;

or

if (line.indexOf(" ") >= 0) continue;

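The latter should be more efficient than the former, since indexOf does not have to allocate an array of substrings.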
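To make the difference between the two checks concrete, here is a minimal sketch. The class name TokenCheckDemo and the helper method names are made up for illustration; only split and indexOf come from the answer. Note that the two checks are not fully equivalent: split drops trailing empty strings, so a single word followed by a space counts as one token, while indexOf flags it as containing a space.

```java
public class TokenCheckDemo {

    // First variant from the answer: more than one space-separated token.
    static boolean hasMultipleTokensSplit(String line) {
        return line.split(" ").length > 1;
    }

    // Second variant from the answer: the line contains a space anywhere.
    static boolean hasSpaceIndexOf(String line) {
        return line.indexOf(" ") >= 0;
    }

    public static void main(String[] args) {
        String[] samples = {
            "adversary",                      // split=false, indexOf=false
            "someone who offers opposition",  // split=true,  indexOf=true
            "aplomb ",                        // split=false, indexOf=true (trailing space)
            " aplomb"                         // split=true,  indexOf=true (leading space)
        };
        for (String s : samples) {
            System.out.printf("[%s] split=%b indexOf=%b%n",
                    s, hasMultipleTokensSplit(s), hasSpaceIndexOf(s));
        }
    }
}
```

Neither check rejects an empty line, so if the input can contain blank lines you would probably want to skip those separately.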

Answer 2

Instead of

if ( lineNum % 4 == 0 ) continue;

just check for lines that contain a space:

if(line.trim().contains(" ")) continue;
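As a rough sketch of how this check behaves on the sample input from the question: the class TrimFilterDemo and the method singleWordLines are hypothetical names, and skipping empty lines is an extra assumption that is not part of the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TrimFilterDemo {

    // Keeps lines that, once trimmed, are non-empty and contain no space.
    // Skipping empty lines is an added assumption, not part of the answer.
    static List<String> singleWordLines(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.contains(" ")) continue;
            result.add(trimmed);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "adversary",
            "someone who offers opposition",
            "The students are united by shared suffering, and by a common adversary. ",
            "\u2014 New York Times (Nov 10, 2014)",
            "aplomb",
            "great coolness and composure under strain",
            "I wish I had handled it with aplomb. ",
            "\u2014 New York Times (May 18, 2014)",
            "apprehensive"
        );
        // Prints: [adversary, aplomb, apprehensive]
        System.out.println(singleWordLines(input));
    }
}
```

Because of the trim() call, a word with leading or trailing spaces is still treated as a single word. Words separated only by a tab would slip through, since the check looks for the space character specifically.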

Answer 3

You are getting an error at java.io.BufferedReader.readLine(Unknown Source), so the input file was not found... Try changing the file name

String fileName = "InputWords";

to

String fileName = "InputWords.txt";

Answer 4

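It depends on your definition of "word":

A sequence of letters
A sequence of non-space characters
A glyph that represents a word (e.g. Chinese)

Let's stick with the first two and use a regular expression for the check, so that leading and trailing whitespace can easily be ignored as well. Here are three ways: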

if (line.matches("\\s*[a-zA-Z]+\\s*")) // One or more ASCII letters
    outputStream.println(line);

if (line.matches("\\s*\\p{L}+\\s*")) // One or more Unicode letters
    outputStream.println(line);

if (line.matches("\\s*\\S+\\s*")) // One or more non-space characters
    outputStream.println(line);
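To see where the three patterns disagree, here is a small sketch that runs them against a few sample strings. The class WordRegexDemo and its method names are invented for this illustration; the regular expressions are the ones from the answer, written with the doubled backslashes that Java string literals require.

```java
public class WordRegexDemo {

    // One or more ASCII letters, optionally surrounded by whitespace.
    static boolean asciiWord(String line) {
        return line.matches("\\s*[a-zA-Z]+\\s*");
    }

    // One or more Unicode letters, optionally surrounded by whitespace.
    static boolean unicodeWord(String line) {
        return line.matches("\\s*\\p{L}+\\s*");
    }

    // One run of non-whitespace characters, optionally surrounded by whitespace.
    static boolean nonSpaceRun(String line) {
        return line.matches("\\s*\\S+\\s*");
    }

    public static void main(String[] args) {
        String[] samples = {
            "aplomb",          // ascii=true,  unicode=true,  nonSpace=true
            "  aplomb  ",      // ascii=true,  unicode=true,  nonSpace=true
            "na\u00EFve",      // ascii=false, unicode=true,  nonSpace=true
            "well-known",      // ascii=false, unicode=false, nonSpace=true
            "great coolness",  // ascii=false, unicode=false, nonSpace=false
            ""                 // ascii=false, unicode=false, nonSpace=false
        };
        for (String s : samples) {
            System.out.printf("[%s] ascii=%b unicode=%b nonSpace=%b%n",
                    s, asciiWord(s), unicodeWord(s), nonSpaceRun(s));
        }
    }
}
```

All three patterns reject empty lines because of the + quantifier, and since \s also matches tabs, tab-separated words are rejected too.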

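As for the MalformedInputException, it is caused by a code page mismatch (the exception is thrown by the StreamDecoder).

newBufferedReader(path) reads the file as UTF-8, but the file is most likely encoded in the system default code page rather than UTF-8.

Use newBufferedReader(path, Charset.defaultCharset()) instead.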
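The mismatch can be reproduced with a short sketch like the one below. It assumes, purely for illustration, that the file was written as ISO-8859-1; the real encoding of the asker's file is not known. The class CharsetReadDemo, the readLines helper, and the temporary file are all made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CharsetReadDemo {

    // Reads all lines of the file, decoding it with the given charset.
    static List<String> readLines(Path path, Charset cs) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(path, cs)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption for the demo: the file was saved as ISO-8859-1, not UTF-8.
        Path tmp = Files.createTempFile("words", ".txt");
        Files.write(tmp, Arrays.asList("caf\u00E9", "aplomb"), StandardCharsets.ISO_8859_1);

        try {
            // The accented letter is stored as a single byte that is not valid UTF-8.
            readLines(tmp, StandardCharsets.UTF_8);
        } catch (MalformedInputException e) {
            System.out.println("UTF-8 decoding failed: " + e.getMessage());
        }

        // Decoding with the charset the file was written in succeeds.
        System.out.println(readLines(tmp, StandardCharsets.ISO_8859_1));
        Files.delete(tmp);
    }
}
```

One caveat: since JDK 18 the default charset is UTF-8 unless overridden, so on newer runtimes Charset.defaultCharset() may not help, and naming the file's actual encoding explicitly is likely the safer choice.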

Answer 5

It works! I needed to add the charset.

    // Requires import java.nio.charset.Charset; in addition to the imports in the question
    public static void main(String[] args) {
        String fileNameOutput = "OutputFile.txt";
        String fileName = "InputWords.txt";

        // Read with the platform default charset instead of UTF-8
        Charset cs = Charset.defaultCharset();
        try (BufferedReader bReader = Files.newBufferedReader(Paths.get(fileName), cs);
             PrintWriter outputStream = new PrintWriter(fileNameOutput)) {

            String line;
            while ((line = bReader.readLine()) != null) {
                // Skip lines that contain more than one space-separated token
                if (line.split(" ").length > 1) continue;

                outputStream.println(line);
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

java

string

bufferedreader