Run a Java program on multiple files in a directory, output with unique names

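My directory structure is as follows:

base_directory / level_one_a, level_one_b, level_one_c /

Each of these level_one_x directories then contains a number of subdirectories, i.e.

/level_one_a_1, level_one_a_2, level_one_a_3...

and likewise for level_one_b & level_one_c

Inside level_one_a_1 there are more still, i.e. level_one_a_1_I, level_one_a_1_II, level_one_a_1_III, level_one_a_1_IV...

Finally, inside level_one_a_1_IV, and every directory on that same level, are the files I want to operate on.

I guess a shorter way of saying that is start/one/two/three/*files*

There are a lot of files, and I want to process them with a simple Java program I wrote: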

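    // br is presumably a BufferedReader opened on the input file (not shown)
    try
    {
        // Read the whole file into one string, line by line
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();

        while (line != null)
        {
            sb.append(line);
            sb.append(System.lineSeparator());
            line = br.readLine();
        }
        String everything = sb.toString();

        // Parse with jsoup and print the text of the selected elements
        Document doc = Jsoup.parse(everything);
        String link = doc.select("block.full_text").text();
        System.out.println(link);
    }
    finally
    {
        br.close();
    }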

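It uses jsoup.

I want to build this out so that the program navigates the directory structure on its own, grabs each file, and processes it with the code above, using a buffered reader and a file reader I guess. How can I do that? I tried to implement this solution but I couldn't get it to work.

Ideally I would also like to write out each processed file under a unique name, i.e. if the file is named 00001.txt it might be saved as 00001_output.txt, but that's a horse of a different color.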

Just use java.io.File and its method listFiles. See the javadoc File API.

A similar question has been posted here on SO: Recursively list files in Java
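As a rough illustration of the listFiles approach, a recursive listing could look something like the sketch below. The class name ListFilesRecursively, the method collectFiles and the default directory name are made up for this example; only File.listFiles and File.isDirectory come from the java.io.File API.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ListFilesRecursively {

    // Hypothetical helper: collects every non-directory entry below dir,
    // descending into subdirectories.
    static List<File> collectFiles(File dir) {
        List<File> result = new ArrayList<>();
        // listFiles returns null if dir is not a directory or cannot be read
        File[] entries = dir.listFiles();
        if (entries == null) {
            return result;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                result.addAll(collectFiles(entry));
            } else {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The default directory name is an assumption for the demo
        File base = new File(args.length > 0 ? args[0] : "base_directory");
        for (File f : collectFiles(base)) {
            System.out.println(f.getPath());
        }
    }
}
```

This visits every level, so if only the files four levels down are wanted, a depth or name check would still have to be added.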

You could also use the Java NIO 2 API.

import static java.nio.file.FileVisitResult.CONTINUE;

import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashSet;
import java.util.Set;

public class ProcessFiles extends SimpleFileVisitor<Path> {

    static final String OUT_FORMAT = "%-17s: %s%n";
    static final int MAX_DEPTH = 4;
    static final Path baseDirectory = Paths.get("R:/base_directory");

    public static void main(String[] args) throws IOException {
        Set<FileVisitOption> visitOptions = new HashSet<>();
        visitOptions.add(FileVisitOption.FOLLOW_LINKS);
        Files.walkFileTree(baseDirectory, visitOptions, MAX_DEPTH,
                new ProcessFiles()
        );
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attr) {
        // getNameCount() counts base_directory itself, not the root R:\
        if (file.getNameCount() <= MAX_DEPTH) {
            System.out.printf(OUT_FORMAT, "skip wrong level", file);
            // SKIP_SUBTREE only has meaning in preVisitDirectory; for a file
            // it behaves like CONTINUE, so return CONTINUE directly
            return CONTINUE;
        } else {
            // add probably a file name check
            System.out.printf(OUT_FORMAT, "process file", file);
            return CONTINUE;
        }
    }

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attr) {
        if (dir.getNameCount() < MAX_DEPTH) {
            System.out.printf(OUT_FORMAT, "walk into dir", dir);
            return CONTINUE;
        }
        // only descend into the one directory name we are interested in
        if (dir.getName(MAX_DEPTH - 1).toString().equals("level_one_a_1_IV")) {
            System.out.printf(OUT_FORMAT, "destination dir", dir);
            return CONTINUE;
        } else {
            System.out.printf(OUT_FORMAT, "skip dir name", dir);
            return FileVisitResult.SKIP_SUBTREE;
        }
    }
}

Assuming the following directory/file structure

base_directory
base_directory/base_directory.file
base_directory/level_one_a
base_directory/level_one_a/level_one_a.file
base_directory/level_one_a/level_one_a_1
base_directory/level_one_a/level_one_a_1/level_one_a_1.file
base_directory/level_one_a/level_one_a_1/level_one_a_1_I
base_directory/level_one_a/level_one_a_1/level_one_a_1_I/level_one_a_1_I.file
base_directory/level_one_a/level_one_a_1/level_one_a_1_II
base_directory/level_one_a/level_one_a_1/level_one_a_1_II/level_one_a_1_II.file
base_directory/level_one_a/level_one_a_1/level_one_a_1_III
base_directory/level_one_a/level_one_a_1/level_one_a_1_III/level_one_a_1_III.file
base_directory/level_one_a/level_one_a_1/level_one_a_1_IV
base_directory/level_one_a/level_one_a_1/level_one_a_1_IV/level_one_a_1_IV.file
base_directory/someother_a
base_directory/someother_a/someother_a.file
base_directory/someother_a/someother_a_1
base_directory/someother_a/someother_a_1/someother_a_1.file
base_directory/someother_a/someother_a_1/someother_a_1_I
base_directory/someother_a/someother_a_1/someother_a_1_I/someother_a_1_I.file
base_directory/someother_a/someother_a_1/someother_a_1_II
base_directory/someother_a/someother_a_1/someother_a_1_II/someother_a_1_II.file
base_directory/someother_a/someother_a_1/someother_a_1_III
base_directory/someother_a/someother_a_1/someother_a_1_III/someother_a_1_III.file
base_directory/someother_a/someother_a_1/someother_a_1_IV
base_directory/someother_a/someother_a_1/someother_a_1_IV/someother_a_1_IV.file

you will get the following output (for demonstration)

walk into dir    : R:\base_directory
skip wrong level : R:\base_directory\base_directory.file
walk into dir    : R:\base_directory\level_one_a
skip wrong level : R:\base_directory\level_one_a\level_one_a.file
walk into dir    : R:\base_directory\level_one_a\level_one_a_1
skip wrong level : R:\base_directory\level_one_a\level_one_a_1\level_one_a_1.file
skip dir name    : R:\base_directory\level_one_a\level_one_a_1\level_one_a_1_I
skip dir name    : R:\base_directory\level_one_a\level_one_a_1\level_one_a_1_II
skip dir name    : R:\base_directory\level_one_a\level_one_a_1\level_one_a_1_III
destination dir  : R:\base_directory\level_one_a\level_one_a_1\level_one_a_1_IV
process file     : R:\base_directory\level_one_a\level_one_a_1\level_one_a_1_IV\level_one_a_1_IV.file
walk into dir    : R:\base_directory\someother_a
skip wrong level : R:\base_directory\someother_a\someother_a.file
walk into dir    : R:\base_directory\someother_a\someother_a_1
skip wrong level : R:\base_directory\someother_a\someother_a_1\someother_a_1.file
skip dir name    : R:\base_directory\someother_a\someother_a_1\someother_a_1_I
skip dir name    : R:\base_directory\someother_a\someother_a_1\someother_a_1_II
skip dir name    : R:\base_directory\someother_a\someother_a_1\someother_a_1_III
skip dir name    : R:\base_directory\someother_a\someother_a_1\someother_a_1_IV
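For the output naming mentioned in the question (00001.txt saved as 00001_output.txt), the "process file" branch could call something along these lines. This is only a sketch: OutputNaming, outputPathFor, extractText and process are hypothetical names, extractText is a stand-in for the jsoup step from the question, and UTF-8 is an assumed encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputNaming {

    // Hypothetical helper: 00001.txt -> 00001_output.txt in the same directory.
    // A name without an extension just gets "_output" appended.
    static Path outputPathFor(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String outName = (dot > 0)
                ? name.substring(0, dot) + "_output" + name.substring(dot)
                : name + "_output";
        return input.resolveSibling(outName);
    }

    // Placeholder for the jsoup step from the question, e.g.
    // Jsoup.parse(content).select("block.full_text").text()
    static String extractText(String content) {
        return content.trim();
    }

    // Reads the input file, transforms it and writes the result next to it.
    static Path process(Path input) throws IOException {
        // UTF-8 is an assumption; adjust to the real encoding of the files
        String content = new String(Files.readAllBytes(input), StandardCharsets.UTF_8);
        Path output = outputPathFor(input);
        Files.write(output, extractText(content).getBytes(StandardCharsets.UTF_8));
        return output;
    }

    public static void main(String[] args) {
        System.out.println(outputPathFor(Paths.get("00001.txt"))); // prints 00001_output.txt
    }
}
```

Because the output lands in the same directory as the input, a second run of the walker would pick up the *_output files as well, so a file name check in visitFile would probably be needed.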

Some links to the Oracle tutorials for further reading
Walking the File Tree
Finding Files