Why does File.ReadAllText() also recognize UTF-16 encodings?

Question

I am reading a file using

File.ReadAllText(..., Encoding.ASCII);

According to the documentation [MSDN] (emphasis mine),

This method attempts to automatically detect the encoding of a file based on the presence of byte order marks. Encoding formats UTF-8 and UTF-32 (both big-endian and little-endian) can be detected.

However, in my case the ASCII file erroneously started with 0xFE 0xFF, and it was detected as UTF-16 (probably big-endian, but I did not check).

Answer 1

According to File [referencesource], it uses a StreamReader:

private static String InternalReadAllText(String path, Encoding encoding, bool checkHost)
{
  ...
  using (StreamReader sr = new StreamReader(path, encoding, true, StreamReader.DefaultBufferSize, checkHost))
    return sr.ReadToEnd();
}

That StreamReader overload with 5 parameters [MSDN] is documented as supporting UTF-16 as well (in .NET documentation, "Unicode" refers to UTF-16):

It automatically recognizes UTF-8, little-endian Unicode, big-endian Unicode, little-endian UTF-32, and big-endian UTF-32 text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.

(emphasis mine)
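The behavior from the question can be reproduced with a small sketch. The class name, method name, and the temporary file are assumptions made up for illustration; the bytes are the UTF-16 big-endian BOM (0xFE 0xFF) followed by "Hi" encoded as UTF-16 BE.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class, not part of the original post.
static class BomDetectionDemo
{
    public static string ReadWithAscii()
    {
        string path = Path.GetTempFileName();
        try
        {
            // 0xFE 0xFF = UTF-16 big-endian BOM, then 'H' and 'i' as UTF-16 BE.
            byte[] bytes = { 0xFE, 0xFF, 0x00, 0x48, 0x00, 0x69 };
            File.WriteAllBytes(path, bytes);

            // ASCII is requested, but the underlying StreamReader is created
            // with BOM detection enabled, so the BOM should take precedence.
            return File.ReadAllText(path, Encoding.ASCII);
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        Console.WriteLine(ReadWithAscii());
    }
}
```

If BOM detection overrides the requested encoding, as the quoted StreamReader documentation suggests, the returned string should be "Hi" rather than ASCII-decoded garbage.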

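Since File.ReadAllText() is supposed to, and is documented to, detect Unicode BOMs, it is probably a good idea that it detects UTF-16 as well. However, the documentation is wrong and should be updated. I filed issue #7515.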
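If the BOM detection is unwanted, one possible workaround is to create the StreamReader directly and switch detection off. This is a minimal sketch and not something the original answer proposes; the class and method names are hypothetical, and it relies on the public StreamReader(string, Encoding, bool) constructor.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, named for illustration only.
static class NoBomDetection
{
    public static string ReadAllTextAsAscii(string path)
    {
        // detectEncodingFromByteOrderMarks: false keeps the reader on the
        // encoding passed in, even if the file starts with a BOM.
        using (var sr = new StreamReader(path, Encoding.ASCII,
                                         detectEncodingFromByteOrderMarks: false))
        {
            return sr.ReadToEnd();
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Stray 0xFE 0xFF at the start, followed by the ASCII text "AB".
            File.WriteAllBytes(path, new byte[] { 0xFE, 0xFF, 0x41, 0x42 });

            // The two leading bytes are outside the ASCII range, so the ASCII
            // decoder is expected to emit replacement characters for them.
            Console.WriteLine(ReadAllTextAsAscii(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

With detection disabled, the stray leading bytes are decoded with the requested encoding instead of being consumed as a BOM, so they still have to be stripped or tolerated by the caller.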


.net

c#

file