Json loading with stdin failure

Question

I am trying to load my JSON file from stdin by running python algo.py < number.json on the Windows command line and calling json.loads(sys.stdin) in my script, but it fails.

However, I can load my JSON with:

with open('number.json', encoding='utf-8-sig') as f:
    n = json.load(f)

With json.loads(sys.stdin) I get this exception:

the JSON object must be str, bytes or bytearray, not TextIOWrapper

With json.load(sys.stdin) or json.loads(sys.stdin.read()) I get this exception:

Expecting value: line 1 column 1 (char 0)

Has anyone run into the same problem? I read several posts on this forum before asking for help.

Here is the JSON file:

[
  {
    "x": 1,
    "y": 4,
    "z": -1,
    "t": 2
  },
  {
    "x": 2,
    "y": -1,
    "z": 3,
    "t": 0
  }
]

Answer 1

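Based on your comments, your problem appears to be that your file starts with a UTF-8 BOM. That means the three extra bytes 0xEF 0xBB 0xBF are the first thing in the file.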

The Python json module documentation states that it does not accept a BOM. Therefore, you must remove it before passing the JSON data to json.load or json.loads.
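As a small illustration of that restriction, the sketch below decodes the same bytes two ways. The sample payload is made up for the demonstration, and the exact error text depends on how the input was decoded, so it may differ from the message in the question.

```python
import json

# Hypothetical sample: a UTF-8 BOM followed by a tiny JSON document.
raw = b'\xef\xbb\xbf[{"x": 1}]'

# Decoding as plain utf-8 keeps the BOM as the character U+FEFF,
# and json.loads refuses a str that starts with it.
rejected = False
try:
    json.loads(raw.decode('utf-8'))
except json.JSONDecodeError as exc:
    rejected = True
    print('rejected:', exc)

# Decoding as utf-8-sig drops the BOM, so parsing succeeds.
data = json.loads(raw.decode('utf-8-sig'))
print(data)
```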

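There are at least three ways to remove the BOM. The best is simply to edit the JSON file and remove it. If that is not possible, you can skip it in your Python code.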

If your code only ever needs to handle files that contain a BOM, you can use:

assert b'\xEF\xBB\xBF' == sys.stdin.buffer.read(3)

This verifies that the bytes removed are in fact a UTF-8 BOM.
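One caveat with the assert form: when Python runs with -O, assert statements are skipped entirely, so the three bytes would not be read at all. A minimal sketch with an explicit check is shown here; the helper name load_json_after_bom is hypothetical, and io.BytesIO stands in for sys.stdin.buffer so the example can run on its own.

```python
import io
import json

UTF8_BOM = b'\xef\xbb\xbf'

def load_json_after_bom(stream):
    """Parse JSON from a binary stream that must start with a UTF-8 BOM.

    Hypothetical helper for illustration only.
    """
    prefix = stream.read(3)
    if prefix != UTF8_BOM:
        raise ValueError(f'expected a UTF-8 BOM, got {prefix!r}')
    # The BOM is already consumed, so plain utf-8 decoding is enough.
    return json.loads(stream.read().decode('utf-8'))

# Stand-in for sys.stdin.buffer, with made-up sample data.
demo = io.BytesIO(UTF8_BOM + b'[{"x": 1, "y": 4}]')
result = load_json_after_bom(demo)
print(result)
```

In the actual script the call would be load_json_after_bom(sys.stdin.buffer).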

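If you need to handle files that may or may not contain a BOM, you can wrap your standard input stream in a TextIOWrapper with the right encoding, as described in this answer. The code then looks like this: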

import io
import sys

stdin_wrapper = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8-sig')
# use stdin_wrapper instead of stdin
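Putting the wrapper together with json.load might look like the sketch below. The function name load_json is an assumption made for the example, and io.BytesIO objects stand in for sys.stdin.buffer so that both the BOM and no-BOM cases can be exercised.

```python
import io
import json

def load_json(binary_stream):
    """Parse JSON from a binary stream, tolerating an optional UTF-8 BOM.

    Hypothetical helper for illustration only.
    """
    text_stream = io.TextIOWrapper(binary_stream, encoding='utf-8-sig')
    try:
        return json.load(text_stream)
    finally:
        # Detach so the wrapper does not close the underlying stream
        # when it is garbage collected.
        text_stream.detach()

# Stand-ins for sys.stdin.buffer, with made-up sample data.
with_bom = io.BytesIO(b'\xef\xbb\xbf[{"x": 1, "t": 2}]')
without_bom = io.BytesIO(b'[{"x": 1, "t": 2}]')

parsed_with_bom = load_json(with_bom)
parsed_without_bom = load_json(without_bom)
print(parsed_with_bom == parsed_without_bom)
```

In the actual script the call would be load_json(sys.stdin.buffer).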

To quote the Python Unicode HOWTO on why utf-8-sig:

In some areas, it is also convention to use a “BOM” at the start of UTF-8 encoded files; the name is misleading since UTF-8 is not byte-order dependent. The mark simply announces that the file is encoded in UTF-8. For reading such files, use the ‘utf-8-sig’ codec to automatically skip the mark if present.
