Trying to read BSON file, get bson.errors.InvalidBSON: objsize too large

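I am currently trying to read a BSON file so that I can import it into a database. I can already read the file and print it as bytes, but all I end up with is a bson.errors.InvalidBSON: objsize too large error.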

This is the code that tries to decode the file:

with zip.open(name) as myfile:
    content = myfile.read()
    print(content)
    print(bson.decode_all(content))

This is the output I get:

b'[{"_id": {"$oid": "5bf3cf511c9d44000088c376"}, "some": "sort of"}, {"_id": {"$oid": "5bf3cf5c1c9d44000088c377"}, "test": "data"}]'
Traceback (most recent call last):
  File "/home/jonas/.envs/mongodb-backup-py-aGZYxULQ/bin/mongo-backup", line 11, in <module>
    load_entry_point('mongo-backup-cli', 'console_scripts', 'mongo-backup')()
  File "/home/jonas/.envs/mongodb-backup-py-aGZYxULQ/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/jonas/.envs/mongodb-backup-py-aGZYxULQ/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/jonas/.envs/mongodb-backup-py-aGZYxULQ/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/jonas/.envs/mongodb-backup-py-aGZYxULQ/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jonas/.envs/mongodb-backup-py-aGZYxULQ/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/jonas/work/mongodb-backup-py/mongo_backup/cli.py", line 45, in restore
    restore_files(uri, db_name, file.name)
  File "/home/jonas/work/mongodb-backup-py/mongo_backup/restore.py", line 6, in restore_files
    print(read_zip_file(file))
  File "/home/jonas/work/mongodb-backup-py/mongo_backup/zip.py", line 24, in read_zip_file
    print(bson.decode_all(content))
bson.errors.InvalidBSON: objsize too large
print(content)

b'[{"_id": {"$oid": "5bf3cf511c9d44000088c376"}, "some": "sort of"}, {"_id": {"$oid": "5bf3cf5c1c9d44000088c377"}, "test": "data"}]'

The bytes in the content variable are JSON-encoded BSON (MongoDB Extended JSON text, as the "$oid" keys show), not plain binary BSON.
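To see where "objsize too large" most likely comes from, it helps to remember that a binary BSON document begins with a little-endian int32 holding its total size. The sketch below uses only the standard library and a shortened copy of the bytes from the question (the truncation is my own, for brevity) to show what happens when the first four characters of JSON text are read as that size field:

```python
import struct

# A shortened copy of the Extended JSON bytes printed in the question.
content = b'[{"_id": {"$oid": "5bf3cf511c9d44000088c376"}, "some": "sort of"}]'

# A BSON document starts with a little-endian int32: its total size in bytes.
# Here those four bytes are just the ASCII characters at the start of the JSON.
(declared_size,) = struct.unpack_from("<i", content, 0)

print(content[:4])                   # the four bytes a BSON decoder reads as the size
print(declared_size, len(content))   # claimed size vs. actual number of bytes
print(declared_size > len(content))  # True: the claimed size exceeds the data
```

Because the claimed size is far larger than the data actually available, a decoder that validates the length prefix has good reason to reject the input.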

If this is the output format you intend to keep using, you need to change your code to load the string into Python objects with bson's JSON utilities:

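from bson import json_util  # import the submodule explicitly

with zip.open(name) as myfile:
    content = myfile.read()
    print(content)
    # The bytes are Extended JSON text, so parse them with json_util.loads
    # instead of bson.decode_all, which expects binary BSON.
    print(json_util.loads(content))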
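If the archive might contain either binary BSON or Extended JSON, one option is to check the length prefix before choosing a decoder. This is only a sketch: looks_like_bson and load_documents are hypothetical helper names of my own, not part of PyMongo, and the check is a heuristic rather than full validation.

```python
import struct


def looks_like_bson(data: bytes) -> bool:
    """Heuristic: a BSON document starts with its own size and ends with a NUL byte."""
    if len(data) < 5:  # the smallest BSON document (empty) is 5 bytes long
        return False
    (declared_size,) = struct.unpack_from("<i", data, 0)
    return 5 <= declared_size <= len(data) and data[declared_size - 1] == 0


def load_documents(data: bytes):
    """Decode a dump that may be binary BSON or Extended JSON text (hypothetical helper)."""
    # Imported lazily so looks_like_bson can be used without PyMongo installed.
    import bson
    from bson import json_util

    if looks_like_bson(data):
        return bson.decode_all(data)
    return json_util.loads(data)
```

With the bytes from the question, looks_like_bson returns False, so load_documents would fall through to json_util.loads.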