PDF merging with itext and pdfbox
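I have a multi-module Maven project with a request-generation process. This process includes several Vaadin upload components, through which we upload documents that may only be PNG, JPG, PDF, or BMP files.
At the end of the process, I merge all the documents into a single PDF and then download it with a file downloader.
The function I call on the button click event is: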
/**
 * This function is responsible for getting
 * all documents from request and merge
 * them in a single pdf file for
 * download purposes
 * @throws Exception
 */
protected void downloadMergedDocument() throws Exception {
    // Calling create pdf function for merged pdf
    createPDF();
    // Setting the merged file as a resource for file downloader
    Resource myResource = new FileResource(new File(mergedReportPath + request.getWebProtocol() + ".pdf"));
    FileDownloader fileDownloader = new FileDownloader(myResource);
    // Extending the download button for download
    fileDownloader.extend(downloadButton);
}
/**
 * This function is responsible for providing
 * the PDF related to a particular request that
 * contains all the documents merged inside it
 * @throws Exception
 */
private void createPDF() throws Exception {
    try {
        // Getting the current request
        request = evaluationRequestUI.getRequest();
        // Fetching all documents of the request
        Collection<DocumentBean> docCollection = request.getDocuments();
        // Initializing Document using the itext library
        Document doc = new Document();
        // Setting PdfWriter for getting the merged images file
        PdfWriter.getInstance(doc, new FileOutputStream(mergedReportPath + "/mergedImages_" + request.getWebProtocol() + ".pdf"));
        // Opening document
        doc.open();
        /*
         * Here iterating on document collection for the images type
         * document for merging them into one pdf
         */
        for (DocumentBean documentBean : docCollection) {
            byte[] documents = documentBean.getByteArray();
            if (documentBean.getFilename().toLowerCase().contains("png") ||
                    documentBean.getFilename().toLowerCase().contains("jpeg") ||
                    documentBean.getFilename().toLowerCase().contains("jpg") ||
                    documentBean.getFilename().toLowerCase().contains("bmp")) {
                Image img = Image.getInstance(documents);
                doc.setPageSize(img);
                doc.newPage();
                img.setAbsolutePosition(0, 0);
                doc.add(img);
            }
        }
        // Closing the document
        doc.close();
        /*
         * Here we get all the images type documents merged into
         * one pdf, now moving to pdfbox for searching the pdf related
         * document types in the request and merging the above resultant
         * pdf and the pdf document in the request into one pdf
         */
        PDFMergerUtility utility = new PDFMergerUtility();
        // Adding the above resultant pdf as a source
        utility.addSource(new File(mergedReportPath + "/mergedImages_" + request.getWebProtocol() + ".pdf"));
        // Iterating for the pdf document types in the collection
        for (DocumentBean documentBean : docCollection) {
            byte[] documents = documentBean.getByteArray();
            if (documentBean.getFilename().toLowerCase().contains("pdf")) {
                utility.addSource(new ByteArrayInputStream(documents));
            }
        }
        // Here setting the final pdf name
        utility.setDestinationFileName(mergedReportPath + request.getWebProtocol() + ".pdf");
        // Here final merging and then result
        utility.mergeDocuments();
    } catch (Exception e) {
        m_logger.error("CATCH", e);
        throw e;
    }
}
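Note: mergedReportPath is the path defined for storing the PDF files, from which they are then retrieved for download.
Now, I have two problems:
- When I run this process for the first request, it creates the PDFs in the destination folder, but it does not download the merged file.
- When I run the process again for a second request, it gets stuck at utility.mergeDocuments(). That is, if it finds that the PDF already exists in the destination folder, it hangs. I don't know where the problem is. Please help.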
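In the comment section of your question, you clarified that you don't need the file on disk, but that you want to send the PDF to the browser, and you want to know how to achieve this. The official documentation explains it: How can I serve a PDF to a browser without storing a file on the server side?
This is how you create a PDF in memory: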
// step 1
Document document = new Document();
// step 2
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfWriter.getInstance(document, baos);
// step 3
document.open();
// step 4
document.add(new Paragraph("Hello"));
// step 5
document.close();
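Merging PDFs is done with PdfCopy: How to merge documents correctly?
You need to apply the same principle as above to those examples: replace the FileOutputStream with a ByteArrayOutputStream.
Now you have the PDF bytes stored in the baos object. We can send them to the browser like this: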
// setting some response headers
response.setHeader("Expires", "0");
response.setHeader("Cache-Control",
"must-revalidate, post-check=0, pre-check=0");
response.setHeader("Pragma", "public");
// setting the content type
response.setContentType("application/pdf");
// the contentlength
response.setContentLength(baos.size());
// write ByteArrayOutputStream to the ServletOutputStream
OutputStream os = response.getOutputStream();
baos.writeTo(os);
os.flush();
os.close();
If you have further questions, be sure to read the documentation.
In PDFBox 2.0, you can set an output stream with setDestinationStream(). So you can simply call
response.setContentType("application/pdf");
OutputStream os = response.getOutputStream();
utility.setDestinationStream(os);
utility.mergeDocuments();
os.flush();
os.close();
You can't set the response size this way; if you must, use a ByteArrayOutputStream as in Bruno's answer or this one.
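If the response size is needed, the buffering idea can be sketched with plain java.io as follows. This is only an illustration under stated assumptions: the class BufferedPdfResponse, the PdfProducer interface, and the buffer and send methods are hypothetical names that appear in neither library. The point is simply that the whole PDF is written to memory first, so its size is known before anything is sent.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper, not part of iText or PDFBox.
public class BufferedPdfResponse {

    /** Hypothetical callback: anything that writes a complete PDF to a stream. */
    @FunctionalInterface
    public interface PdfProducer {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Runs the producer against an in-memory buffer, so the total size
     * is known before a single byte reaches the real target.
     */
    public static ByteArrayOutputStream buffer(PdfProducer producer) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        producer.writeTo(baos);
        return baos;
    }

    /** Copies the buffered bytes to the target and returns how many were sent. */
    public static int send(ByteArrayOutputStream baos, OutputStream target) throws IOException {
        int length = baos.size(); // the value one would pass to setContentLength(...)
        baos.writeTo(target);
        target.flush();
        return length;
    }
}
```

In a servlet, the producer would be the merge itself (for example, pointing setDestinationStream() at the buffer and calling mergeDocuments()), and response.setContentLength(baos.size()) would be called before the bytes are written to response.getOutputStream(). The trade-off is that the entire merged PDF is held in memory.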