How to convert a PDF byte array to a JPG image byte array in Java using the Apache PDFBox API?

Question

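I have been assigned this task in my project. I get a PDF as a byte array from a service, and I have to convert it to a JPG image and return the JPG as a byte array. Can anyone help me?

I tried the following solution, which converts the PDF byte array to a JPG but does not return the JPG byte array.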

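import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFImageWriter;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DocumentService {
    // Assumption: the snippet does not show how log is declared; an SLF4J logger is used here.
    private static final Logger log = LoggerFactory.getLogger(DocumentService.class);

    public byte[] convertPDFtoImage(byte[] bytes) {
        InputStream targetStream = new ByteArrayInputStream(bytes);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        PDDocument document = null;
        try {
            document = PDDocument.load(targetStream);
            // PDFBox 1.8.x API: writes JPG files to disk under the given path prefix
            // (the arguments 1 and 2 are the start and end pages).
            PDFImageWriter writer = new PDFImageWriter();
            writer.writeImage(document, "jpg", null, 1, 2, "C:\\Shailesh\\aaa");
        } catch (Exception e) {
            log.error(e.getMessage(), e);
        } finally {
            if (document != null) {
                try {
                    document.close();
                } catch (IOException e) {
                    log.error(e.getMessage(), e);
                }
            }
        }
        // Nothing is ever written to baos, so this returns an empty array.
        return baos.toByteArray();
    }
}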

Answer 1

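I found a solution, but the renderer.renderImageWithDPI(pageNumber, 300) method takes a page number as its parameter, so it converts only one page of the PDF at a time. I need the complete PDF converted to JPG as a byte array.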

import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DocumentService {

    // Assumption: the snippet does not show how log is declared; an SLF4J logger is used here.
    private static final Logger log = LoggerFactory.getLogger(DocumentService.class);

    public byte[] convertPDFtoImage(byte[] bytesPDF) {
        InputStream targetStream = new ByteArrayInputStream(bytesPDF);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        PDDocument document = null;
        try {
            // PDFBox 2.x API
            document = PDDocument.load(targetStream);
            PDFRenderer renderer = new PDFRenderer(document);
            // The page index is zero-based, so 0 renders the first page.
            int pageNumber = 0;
            BufferedImage bi = renderer.renderImageWithDPI(pageNumber, 300);
            ImageIO.write(bi, "jpg", baos);
            baos.flush();
        } catch (Exception e) {
            log.error(e.getMessage(), e);
        } finally {
            if (document != null) {
                try {
                    document.close();
                    baos.close();
                    log.info("End convert PDF to Images process");
                } catch (IOException e) {
                    log.error(e.getMessage(), e);
                }
            }
        }
        return baos.toByteArray();
    }
}
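
A JPG file holds a single image, so converting the complete PDF means either returning one JPG per page or combining the pages into one image. Below is a minimal sketch of the second option. It assumes every page has already been rendered to a BufferedImage, for example by calling renderer.renderImageWithDPI(i, 300) in a loop for i from 0 to document.getNumberOfPages() - 1. PageStitcher and stitchToJpg are hypothetical names for illustration, not part of PDFBox; the code uses only JDK classes.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.List;

import javax.imageio.ImageIO;

// Hypothetical helper (not part of PDFBox): stacks page images
// top to bottom and encodes the result as a single JPG.
final class PageStitcher {

    private PageStitcher() {
    }

    static byte[] stitchToJpg(List<BufferedImage> pages) throws IOException {
        if (pages.isEmpty()) {
            throw new IllegalArgumentException("at least one page image is required");
        }

        // The combined image is as wide as the widest page
        // and as tall as all pages together.
        int width = 0;
        int height = 0;
        for (BufferedImage page : pages) {
            width = Math.max(width, page.getWidth());
            height += page.getHeight();
        }

        // TYPE_INT_RGB has no alpha channel, which keeps the image JPEG-friendly.
        BufferedImage combined = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = combined.createGraphics();
        try {
            // White background for the area next to narrower pages.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            int y = 0;
            for (BufferedImage page : pages) {
                g.drawImage(page, 0, y, null);
                y += page.getHeight();
            }
        } finally {
            g.dispose();
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(combined, "jpg", out)) {
            throw new IOException("no JPEG writer available");
        }
        return out.toByteArray();
    }
}
```

Note that JPEG limits each dimension to 65535 pixels, so stacking many pages at 300 DPI can exceed it. For long documents it is probably safer to lower the DPI or return one byte array per page (for example a List<byte[]>) instead of one combined image.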

java

apache

file-io

pdfbox