pyPdf Splitting Large PDF fails after splitting 150-152 pages of the PDF

Question

I have a function that takes a PDF file path as input and splits it into separate pages, as shown below:

import os
import time
from pyPdf import PdfFileReader, PdfFileWriter

def split_pages(file_path):
    print("Splitting the PDF")
    # Create a timestamped temp directory next to this script
    script_dir = os.path.dirname(os.path.abspath(__file__))
    temp_path = os.path.join(script_dir, "temp_" + str(int(time.time())))
    if not os.path.exists(temp_path):
        os.makedirs(temp_path)
    inputpdf = PdfFileReader(open(file_path, "rb"))
    if inputpdf.getIsEncrypted():
        inputpdf.decrypt('')
    # Write each page to its own single-page PDF
    for i in xrange(inputpdf.numPages):
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i))
        with open(os.path.join(temp_path, '%s.pdf' % i), "wb") as outputStream:
            output.write(outputStream)

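It works for small files, but when the PDF has more than 152 pages it only splits pages 0-151 and then stops. It also eats up all of the system's memory until I kill it.

Please let me know what I am doing wrong or where the problem lies, and how I can fix it.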

Answer 1

It looks like the problem is in pyPdf itself. I switched to PyPDF2 and it worked.
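
For reference, here is a minimal sketch of what the same split might look like with PyPDF2. It assumes a PyPDF2 release recent enough to provide the PdfReader and PdfWriter names (older releases use PdfFileReader and PdfFileWriter, as in the question). The page_output_path helper, the out_dir parameter, and the default of placing the temp directory next to the input file are my own illustrative choices, not something from the original code.

```python
import os
import time


def page_output_path(out_dir, index):
    """Return the path of the single-page PDF for a zero-based page index."""
    return os.path.join(out_dir, "%d.pdf" % index)


def split_pages(file_path, out_dir=None):
    """Write each page of file_path to its own PDF and return the output paths."""
    # Assumption: a PyPDF2 release that exposes PdfReader/PdfWriter.
    # Imported here so a missing install only fails when a split is attempted.
    from PyPDF2 import PdfReader, PdfWriter

    if out_dir is None:
        # Assumption: put the timestamped temp directory next to the input file.
        base = os.path.dirname(os.path.abspath(file_path))
        out_dir = os.path.join(base, "temp_%d" % int(time.time()))
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)

    paths = []
    # Keep the input file open only for the duration of the split.
    with open(file_path, "rb") as source:
        reader = PdfReader(source)
        if reader.is_encrypted:
            reader.decrypt("")  # same empty-password attempt as the question
        for i, page in enumerate(reader.pages):
            writer = PdfWriter()  # fresh writer for every page
            writer.add_page(page)
            path = page_output_path(out_dir, i)
            with open(path, "wb") as output_stream:
                writer.write(output_stream)
            paths.append(path)
    return paths
```

Calling split_pages("big.pdf") (a hypothetical file name) would write 0.pdf, 1.pdf, and so on into the temp directory and return their paths. Opening the input with a with block also makes sure the file handle is closed afterwards, which the original code never does.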

pdf

pypdf

python-2.7