How to generate a DOCX in Python and save it in memory?

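My task is to generate a DOCX file from a template and then serve it via Flask. I use python-docx-template (imported as docxtpl), which is just a wrapper around python-docx that allows the use of Jinja templates.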

At the end of their documentation they suggest using StringIO to keep the file in memory only, so my code looks like this:

def report_doc(user_id):
    # Prepare the data...

    from docxtpl import DocxTemplate

    doc = DocxTemplate(app.root_path+'/templates/report.docx')
    doc.render({
        # Pass parameters
    })
    from io import StringIO
    file_stream = StringIO()
    doc.save(file_stream)

    return send_file(file_stream, as_attachment=True, attachment_filename='report_'+user_id+'.docx')

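Saving raises TypeError: string argument expected, got 'bytes'. After some googling I found this answer, which says that ZipFile expects a BytesIO. However, when I replaced StringIO with BytesIO, Flask returned an empty file: no error is raised, but the document is clearly not being sent.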

What exactly would work in this case? And if my approach is completely wrong, how is this supposed to be done in general?

Thanks!

UPD: here is the full traceback of the exception raised by the save call:

File "/ms/controllers.py", line 1306, in report_doc
    doc.save(file_stream)
  File "/.env/lib/python3.5/site-packages/docx/document.py", line 142, in save
    self._part.save(path_or_stream)
  File "/.env/lib/python3.5/site-packages/docx/parts/document.py", line 129, in save
    self.package.save(path_or_stream)
  File "/.env/lib/python3.5/site-packages/docx/opc/package.py", line 160, in save
    PackageWriter.write(pkg_file, self.rels, self.parts)
  File "/.env/lib/python3.5/site-packages/docx/opc/pkgwriter.py", line 33, in write
    PackageWriter._write_content_types_stream(phys_writer, parts)
  File "/.env/lib/python3.5/site-packages/docx/opc/pkgwriter.py", line 45, in _write_content_types_stream
    phys_writer.write(CONTENT_TYPES_URI, cti.blob)
  File "/.env/lib/python3.5/site-packages/docx/opc/phys_pkg.py", line 155, in write
    self._zipf.writestr(pack_uri.membername, blob)
  File "/usr/lib/python3.5/zipfile.py", line 1581, in writestr
    self.fp.write(zinfo.FileHeader(zip64))
TypeError: string argument expected, got 'bytes'
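
The traceback can be reproduced without python-docx, because a .docx file is a ZIP archive and zipfile writes bytes to whatever stream it is given. The following is a minimal sketch using only the standard library; the helper name try_write_zip and the member name are made up for illustration.

```python
import io
import zipfile


def try_write_zip(stream):
    """Write a one-member ZIP archive to `stream`.

    Returns None on success, or the TypeError message on failure.
    """
    try:
        with zipfile.ZipFile(stream, "w") as zf:
            # Stand-in for the XML parts that python-docx writes.
            zf.writestr("word/document.xml", b"<w:document/>")
    except TypeError as exc:
        return str(exc)
    return None


# A text stream rejects the bytes that zipfile writes.
text_error = try_write_zip(io.StringIO())

# A binary stream accepts them.
bytes_error = try_write_zip(io.BytesIO())

print("StringIO:", text_error)
print("BytesIO:", bytes_error)
```

So the TypeError comes from the stream type alone, not from the template or from Flask.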

Using a BytesIO instance is correct, but you need to rewind the file pointer before passing it to send_file:

Make sure that the file pointer is positioned at the start of data to send before calling send_file().

So this should work:

import io

from docxtpl import DocxTemplate
from flask import send_file

def report_doc(user_id):
    # Prepare the data...

    doc = DocxTemplate(app.root_path+'/templates/report.docx')
    doc.render({
        # Pass parameters
    })

    # Create in-memory buffer
    file_stream = io.BytesIO()
    # Save the .docx to the buffer
    doc.save(file_stream)
    # Reset the buffer's file-pointer to the beginning of the file
    file_stream.seek(0)

    # Note: on Flask 2.0 and later, attachment_filename is named download_name
    return send_file(file_stream, as_attachment=True, attachment_filename='report_'+user_id+'.docx')
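
To see why the rewind matters, here is a small standalone sketch of the buffer behaviour, with a few placeholder bytes standing in for the saved document. After a write, the position sits at the end of the buffer, so anything that reads from the current position (as send_file does) gets nothing.

```python
import io

buffer = io.BytesIO()
# Placeholder payload; a real .docx would be written by doc.save(buffer).
buffer.write(b"PK\x03\x04 placeholder docx payload")

# The position is now at the end of the data, so reading yields nothing.
read_without_rewind = buffer.read()

# Move back to the start, then read again.
buffer.seek(0)
read_after_rewind = buffer.read()

print(len(read_without_rewind), len(read_after_rewind))
```

If you only need the raw bytes, buffer.getvalue() returns the whole contents regardless of the current position.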

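(While testing in Firefox, I found that the browser kept retrieving the file from its cache even when I specified a different filename. You may need to clear your browser's cache while testing, disable caching in the dev tools if your browser supports that, or adjust Flask's cache control settings.)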