Why Does Groovy eachDir() Give Me The Same Sort Every Time?

I am creating a file that contains a list of subdirectories:

task createNotes {
  doLast {
    def myFile = new File("my-notes.txt")
    def file = new File("src/test/")
    println myFile.exists()
    myFile.delete()
    println myFile.exists()
    println myFile.absolutePath
    println file.absolutePath
    myFile.withWriter {
      out ->
        file.eachDir { dir ->
          out.println dir.getName()
        }
    }
  }
}

Apparently the sort order is not guaranteed, but I get the same order every time I run it:

soft
java
calc
conc
caab
pres

If I rename the "soft" directory to "sofp", the output is:

java
sofp
calc
conc
caab
pres

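When I change the name back, the output returns to the original order.

It does not appear to be sorted in any particular way, which matches the documentation saying that no order is guaranteed. But if that is the case, why do I get the same order every time?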

Let's break it down, starting with the implementation of the eachDir Groovy extension method:

public static void eachDir(File self, @ClosureParams(value = SimpleType.class, options = "java.io.File") Closure closure) throws FileNotFoundException, IllegalArgumentException {
    eachFile(self, FileType.DIRECTORIES, closure);
}

What does eachFile do?

public static void eachFile(final File self, final FileType fileType, @ClosureParams(value = SimpleType.class, options = "java.io.File") final Closure closure)
        throws FileNotFoundException, IllegalArgumentException {
    checkDir(self);
    final File[] files = self.listFiles();
    // null check because of http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4803836
    if (files == null) return;
    for (File file : files) {
        if (fileType == FileType.ANY ||
                (fileType != FileType.FILES && file.isDirectory()) ||
                (fileType != FileType.DIRECTORIES && file.isFile())) {
            closure.call(file);
        }
    }
}

OK, so Groovy simply calls Java's File#listFiles method under the hood and then iterates over the result without disturbing the existing order.
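
As a rough illustration of that point, here is a minimal Java sketch of what eachDir effectively boils down to. The class name EachDirSketch and the method subdirectoryNames are hypothetical names chosen for this example; only File#listFiles, File#isDirectory and File#getName are real JDK calls.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Groovy or the JDK.
class EachDirSketch {

    /** Returns subdirectory names in exactly the order File#listFiles reported them. */
    static List<String> subdirectoryNames(File parent) {
        List<String> names = new ArrayList<>();
        File[] entries = parent.listFiles();   // order is decided by the OS and file system
        if (entries == null) {
            return names;                      // not a directory, or an I/O error occurred
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {         // the same kind of filter eachDir applies
                names.add(entry.getName());
            }
        }
        return names;                          // no sorting anywhere along the way
    }

    public static void main(String[] args) {
        // "src/test" mirrors the path used in the question; pass another path as an argument.
        File parent = new File(args.length > 0 ? args[0] : "src/test");
        subdirectoryNames(parent).forEach(System.out::println);
    }
}
```

Nothing in this loop reorders the entries, so whatever order the platform hands back is the order the closure sees.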

Moving on to the OpenJDK implementation, we can see that File#listFiles uses FileSystem#list via the normalizedList method.

FileSystem#list is abstract. Continuing to the two most popular implementations, it turns out that both UnixFileSystem#list and WinNTFileSystem#list have a native implementation:

@Override
public native String[] list(File f);

Windows

Digging into the Windows implementation:

JNIEXPORT jobjectArray JNICALL
Java_java_io_WinNTFileSystem_list(JNIEnv *env, jobject this, jobject file)
{
    WCHAR *search_path;
    HANDLE handle;
    WIN32_FIND_DATAW find_data;
    int len, maxlen;
    jobjectArray rv, old;
    DWORD fattr;
    jstring name;
    jclass str_class;
    WCHAR *pathbuf;
    DWORD err;

    str_class = JNU_ClassString(env);
    CHECK_NULL_RETURN(str_class, NULL);

    pathbuf = fileToNTPath(env, file, ids.path);
    if (pathbuf == NULL)
        return NULL;
    search_path = (WCHAR*)malloc(2*wcslen(pathbuf) + 6);
    if (search_path == 0) {
        free (pathbuf);
        errno = ENOMEM;
        JNU_ThrowOutOfMemoryError(env, "native memory allocation failed");
        return NULL;
    }
    wcscpy(search_path, pathbuf);
    free(pathbuf);
    fattr = GetFileAttributesW(search_path);
    if (fattr == INVALID_FILE_ATTRIBUTES) {
        free(search_path);
        return NULL;
    } else if ((fattr & FILE_ATTRIBUTE_DIRECTORY) == 0) {
        free(search_path);
        return NULL;
    }

    /* Remove trailing space chars from directory name */
    len = (int)wcslen(search_path);
    while (search_path[len-1] == L' ') {
        len--;
    }
    search_path[len] = 0;

    /* Append "*", or possibly "\*", to path */
    if ((search_path[0] == L'\\' && search_path[1] == L'\0') ||
        (search_path[1] == L':'
        && (search_path[2] == L'\0'
        || (search_path[2] == L'\\' && search_path[3] == L'\0')))) {
        /* No '\' needed for cases like "\" or "Z:" or "Z:\" */
        wcscat(search_path, L"*");
    } else {
        wcscat(search_path, L"\\*");
    }

    /* Open handle to the first file */
    handle = FindFirstFileW(search_path, &find_data);
    free(search_path);
    if (handle == INVALID_HANDLE_VALUE) {
        if (GetLastError() != ERROR_FILE_NOT_FOUND) {
            // error
            return NULL;
        } else {
            // No files found - return an empty array
            rv = (*env)->NewObjectArray(env, 0, str_class, NULL);
            return rv;
        }
    }

    /* Allocate an initial String array */
    len = 0;
    maxlen = 16;
    rv = (*env)->NewObjectArray(env, maxlen, str_class, NULL);
    if (rv == NULL) { // Couldn't allocate an array
        FindClose(handle);
        return NULL;
    }
    /* Scan the directory */
    do {
        if (!wcscmp(find_data.cFileName, L".")
                                || !wcscmp(find_data.cFileName, L".."))
           continue;
        name = (*env)->NewString(env, find_data.cFileName,
                                 (jsize)wcslen(find_data.cFileName));
        if (name == NULL) {
            FindClose(handle);
            return NULL; // error
        }
        if (len == maxlen) {
            old = rv;
            rv = (*env)->NewObjectArray(env, maxlen <<= 1, str_class, NULL);
            if (rv == NULL || JNU_CopyObjectArray(env, rv, old, len) < 0) {
                FindClose(handle);
                return NULL; // error
            }
            (*env)->DeleteLocalRef(env, old);
        }
        (*env)->SetObjectArrayElement(env, rv, len++, name);
        (*env)->DeleteLocalRef(env, name);

    } while (FindNextFileW(handle, &find_data));

    err = GetLastError();
    FindClose(handle);
    if (err != ERROR_NO_MORE_FILES) {
        return NULL; // error
    }

    if (len < maxlen) {
        /* Copy the final results into an appropriately-sized array */
        old = rv;
        rv = (*env)->NewObjectArray(env, len, str_class, NULL);
        if (rv == NULL)
            return NULL; /* error */
        if (JNU_CopyObjectArray(env, rv, old, len) < 0)
            return NULL; /* error */
    }
    return rv;
}

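We can see that a combination of the FindFirstFileW, FindNextFileW, and FindClose WinAPI functions is used to iterate over the files. Here is an excerpt about ordering from the documentation of FindNextFileW: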

The order in which the search returns the files, such as alphabetical order, is not guaranteed, and is dependent on the file system.

(...)

The order in which this function returns the file names is dependent on the file system type. With the NTFS file system and CDFS file systems, the names are usually returned in alphabetical order. With FAT file systems, the names are usually returned in the order the files were written to the disk, which may or may not be in alphabetical order. However, as stated previously, these behaviors are not guaranteed.

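So the implementation lists files on a best-effort basis, within the constraints of the OS and the file system type. No particular order is guaranteed.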

*nix

What about *nix systems? Here's the code:

JNIEXPORT jobjectArray JNICALL
Java_java_io_UnixFileSystem_list(JNIEnv *env, jobject this,
                                 jobject file)
{
    DIR *dir = NULL;
    struct dirent *ptr;
    int len, maxlen;
    jobjectArray rv, old;
    jclass str_class;

    str_class = JNU_ClassString(env);
    CHECK_NULL_RETURN(str_class, NULL);

    WITH_FIELD_PLATFORM_STRING(env, file, ids.path, path) {
        dir = opendir(path);
    } END_PLATFORM_STRING(env, path);
    if (dir == NULL) return NULL;

    /* Allocate an initial String array */
    len = 0;
    maxlen = 16;
    rv = (*env)->NewObjectArray(env, maxlen, str_class, NULL);
    if (rv == NULL) goto error;

    /* Scan the directory */
    while ((ptr = readdir(dir)) != NULL) {
        jstring name;
        if (!strcmp(ptr->d_name, ".") || !strcmp(ptr->d_name, ".."))
            continue;
        if (len == maxlen) {
            old = rv;
            rv = (*env)->NewObjectArray(env, maxlen <<= 1, str_class, NULL);
            if (rv == NULL) goto error;
            if (JNU_CopyObjectArray(env, rv, old, len) < 0) goto error;
            (*env)->DeleteLocalRef(env, old);
        }
#ifdef MACOSX
        name = newStringPlatform(env, ptr->d_name);
#else
        name = JNU_NewStringPlatform(env, ptr->d_name);
#endif
        if (name == NULL) goto error;
        (*env)->SetObjectArrayElement(env, rv, len++, name);
        (*env)->DeleteLocalRef(env, name);
    }
    closedir(dir);

    /* Copy the final results into an appropriately-sized array */
    if (len < maxlen) {
        old = rv;
        rv = (*env)->NewObjectArray(env, len, str_class, NULL);
        if (rv == NULL) {
            return NULL;
        }
        if (JNU_CopyObjectArray(env, rv, old, len) < 0) {
            return NULL;
        }
    }
    return rv;

 error:
    closedir(dir);
    return NULL;
}

This time the iteration is backed by the opendir/readdir/closedir trio. The POSIX documentation of readdir says only this about ordering:

The type DIR, which is defined in the header <dirent.h>, represents a directory stream, which is an ordered sequence of all the directory entries in a particular directory.

The Linux documentation has a bit more to say:

The order in which filenames are read by successive calls to readdir() depends on the filesystem implementation; it is unlikely that the names will be sorted in any fashion.

That is close enough to Windows: there is no guarantee about the order, other than that there is some order.

Summary

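'is not guaranteed' means that a particular behavior is an implementation detail and you should not rely on it (unlike 'guaranteed' behavior, which stays unchanged for some longer period of time because of backward-compatibility commitments). Such behavior can differ between environments (e.g. JVM implementation, operating system, product version) or even between individual invocations. It stays the same only for as long as the vendor sees fit. That is also why you see the same order on every run: on the same machine, file system and directory contents, the underlying listing is deterministic, but nothing obliges it to stay that way.

So, in this particular case, if you want the files in any particular order, sort them first, even if that happens to produce the same order you already get.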
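
To make the "sort them first" advice concrete, here is a minimal Java sketch that lists subdirectories and sorts their names explicitly. The class SortedDirLister and the method sortedSubdirectoryNames are hypothetical names for this example, and natural String ordering is just one possible choice of comparator.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Groovy or the JDK.
class SortedDirLister {

    /** Returns subdirectory names sorted by natural String order, independent of the file system. */
    static List<String> sortedSubdirectoryNames(File parent) {
        // Keep directories only; listFiles returns null if parent is not a readable directory.
        File[] dirs = parent.listFiles((FileFilter) File::isDirectory);
        if (dirs == null) {
            return Collections.emptyList();
        }
        return Arrays.stream(dirs)
                .map(File::getName)
                .sorted(Comparator.naturalOrder())   // explicit, documented ordering
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "src/test" mirrors the path used in the question; pass another path as an argument.
        File parent = new File(args.length > 0 ? args[0] : "src/test");
        sortedSubdirectoryNames(parent).forEach(System.out::println);
    }
}
```

With the directory names from the question, this would list caab, calc, conc, java, pres, soft regardless of what order the file system returns them in.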