Storing a PDF file in MongoDB with PyMongo raises an error

I want to store a PDF file in my MongoDB database (on Ubuntu) using PyMongo and GridFS, but I get this error: 'utf-8' codec can't decode byte 0xe2 in position 10: invalid continuation byte

How can I store and retrieve a PDF in MongoDB with Python?

from pymongo import MongoClient
import gridfs

db = MongoClient('mongodb://localhost:27017/').myDB
fs = gridfs.GridFS( db )
fileID = fs.put( open(('Test.pdf')  ))
out = fs.get(fileID)

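You need to read the PDF as bytes and encode it before storing it. I won't pretend to understand all the details, but I got this working. Try it and see if it works for you too. The important change is opening the file in binary mode ("rb"); the default text mode tries to decode the PDF as UTF-8, which is where the error comes from. The Base64 step works, but it should not be strictly required, since GridFS stores raw bytes. (FYI, you may also want to specify the collection.)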

import base64
import gridfs
from pymongo import MongoClient

def write_new_pdf(path):
    db = MongoClient('mongodb://localhost:27017/').myDB
    fs = gridfs.GridFS(db)
    # Note, open with the "rb" flag for "read bytes"
    with open(path, "rb") as f:
        encoded_string = base64.b64encode(f.read())
    with fs.new_file(
        chunkSize=800000,
        filename=path) as fp:
        fp.write(encoded_string)
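
To see why the "rb" flag matters, here is a minimal sketch that reproduces the decode error without MongoDB at all. The `SAMPLE` bytes and the helper names are made up for illustration; they imitate the start of a typical PDF, where the second line is a comment containing non-ASCII bytes. It assumes the default text encoding is UTF-8, as it usually is on Ubuntu.

```python
import os
import tempfile

# Hypothetical sample: the first bytes of a typical PDF. The second line is a
# comment made of non-ASCII bytes, which is not valid UTF-8.
SAMPLE = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"


def read_as_text(path):
    """Read the file the way open(path) does when the default encoding is UTF-8."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def read_as_bytes(path):
    """Read the file in binary mode; no decoding is attempted."""
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Return (error raised by the text-mode read or None, bytes from the binary read)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.pdf")
        with open(path, "wb") as f:
            f.write(SAMPLE)
        try:
            read_as_text(path)
            text_error = None
        except UnicodeDecodeError as exc:
            text_error = exc
        return text_error, read_as_bytes(path)


if __name__ == "__main__":
    error, raw = demo()
    print("text mode:  ", error)
    print("binary mode:", raw)
```

The text-mode read fails while the binary read returns the file contents unchanged, which is why the question's `open('Test.pdf')` breaks before GridFS can store anything.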

Update: how to read the PDF back

def read_pdf(filename):
    # Usual setup
    db = MongoClient('mongodb://localhost:27017/').myDB
    fs = gridfs.GridFS(db)
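    # Look up the stored file by name; find_one returns None if nothing matches
    data = fs.find_one(filter=dict(filename=filename))
    if data is None:
        raise FileNotFoundError(f"No file named {filename!r} in GridFS")
    # Decode the Base64 payload and write the original bytes back to disk
    with open(filename, "wb") as f:
        f.write(base64.b64decode(data.read()))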
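
As an alternative, here is a rough sketch that skips Base64 and stores the raw bytes, relying on `fs.put` accepting a file object opened in binary mode and `fs.get` returning a readable object, as in the question. The helper names (`connect`, `store_pdf`, `load_pdf`) and the output name `Test_copy.pdf` are hypothetical, and it assumes the same local server and `myDB` database as above. I have not compared it against the Base64 version for very large files.

```python
def connect(uri="mongodb://localhost:27017/", db_name="myDB"):
    """Return a GridFS handle for the given database (assumed local server)."""
    # Imported here so the helpers below can be exercised without a server.
    from pymongo import MongoClient
    import gridfs

    return gridfs.GridFS(MongoClient(uri)[db_name])


def store_pdf(fs, path):
    """Store the file at `path` as raw bytes and return its GridFS id."""
    # "rb" is the important part: the bytes are passed through undecoded.
    with open(path, "rb") as f:
        return fs.put(f, filename=path)


def load_pdf(fs, file_id, out_path):
    """Write the stored file with id `file_id` to `out_path`."""
    with open(out_path, "wb") as f:
        f.write(fs.get(file_id).read())


if __name__ == "__main__":
    fs = connect()
    file_id = store_pdf(fs, "Test.pdf")
    load_pdf(fs, file_id, "Test_copy.pdf")
```

Passing the `fs` object into the helpers keeps the connection setup in one place, so the same functions can be reused with a different database or collection.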