Why does File.ReadAllText() also recognize UTF-16 encodings?
I read a file using
File.ReadAllText(..., Encoding.ASCII);
According to the documentation [MSDN] (emphasis mine),
This method attempts to automatically detect the encoding of a file based on the presence of byte order marks. Encoding formats UTF-8 and UTF-32 (both big-endian and little-endian) can be detected.
However, in my case the ASCII file mistakenly started with the bytes 0xFE 0xFF, and it was detected as UTF-16 (probably big-endian, but I did not check).
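To reproduce this, a small test along the following lines may help. It is only a sketch: the class name `BomDemo`, the helper `ReadAsAscii`, and the sample bytes are made up for illustration, and it assumes the current BOM-detecting behavior of `File.ReadAllText` described in this post.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: writes the bytes to a temp file, reads them back
    // with File.ReadAllText and Encoding.ASCII, then deletes the file.
    public static string ReadAsAscii(byte[] bytes)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, bytes);
            return File.ReadAllText(path, Encoding.ASCII);
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        // 0xFE 0xFF is the UTF-16 big-endian byte order mark;
        // 0x00 0x41 is the letter 'A' in UTF-16 big-endian.
        byte[] bytes = { 0xFE, 0xFF, 0x00, 0x41 };
        string text = ReadAsAscii(bytes);

        // If the BOM is honored despite Encoding.ASCII being requested,
        // the result should be the single character "A".
        Console.WriteLine("Length: " + text.Length);
        Console.WriteLine("Decoded as UTF-16: " + (text == "A"));
    }
}
```

If the requested ASCII encoding were used instead, the two BOM bytes would be expected to show up as replacement characters rather than being consumed.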
According to File [referencesource], it uses a StreamReader:
private static String InternalReadAllText(String path, Encoding encoding, bool checkHost)
{
...
using (StreamReader sr = new StreamReader(path, encoding, true, StreamReader.DefaultBufferSize, checkHost))
return sr.ReadToEnd();
}
And that StreamReader overload with 5 parameters [MSDN] is documented to support UTF-16 as well:
It automatically recognizes UTF-8, little-endian Unicode, big-endian Unicode, little-endian UTF-32, and big-endian UTF-32 text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.
(emphasis mine)
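If the BOM detection is not wanted, one possible workaround is to create the StreamReader yourself and pass false for the detection parameter. This is a sketch under that assumption, not what File does internally; the class name `AsciiReader` and the method `ReadAllTextNoBomDetection` are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class AsciiReader
{
    // Hypothetical helper: reads the whole file with exactly the given
    // encoding, without switching encodings based on a byte order mark.
    public static string ReadAllTextNoBomDetection(string path, Encoding encoding)
    {
        using (var sr = new StreamReader(path, encoding, detectEncodingFromByteOrderMarks: false))
            return sr.ReadToEnd();
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Same sample as before: a UTF-16 big-endian BOM followed by 0x00 0x41.
            File.WriteAllBytes(path, new byte[] { 0xFE, 0xFF, 0x00, 0x41 });
            string text = ReadAllTextNoBomDetection(path, Encoding.ASCII);

            // With detection off, every byte is decoded as ASCII, so the
            // BOM bytes are not consumed and the text keeps one char per byte.
            Console.WriteLine("Length: " + text.Length);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Decoding the raw bytes with Encoding.ASCII.GetString(File.ReadAllBytes(path)) should give the same effect for small files.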
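Since File.ReadAllText() is supposed to, and documented to, detect Unicode BOMs, it is probably a good idea that it detects UTF-16 as well. However, the documentation is wrong and should be updated. I filed issue #7515.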