Extract ID of a PDF document using iTextSharp

Question

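I need to extract the PDF identifier that appears in the document trailer, but I am unable to get the value. For example, my PDF file contains the following: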

trailer
<</Size 196/Prev 370761/Root 160 0 R/Info 158 0 R/ID[<30EB7FCBB6756E461176FBBD0CEBA7B9><DB67D6D43AE0FA4FBF8CC171FC66790A>]>>

I need to extract the value 30EB7FCBB6756E461176FBBD0CEBA7B9. Using PdfReader.Trailer I get a dictionary-type object that has an 'ID' key, but I am not able to get the required value from it.

Answer 1

"Using PdfReader.Trailer I get a dictionary-type object that has an 'ID' key, but I am not able to get the required value from it."

By looking at PdfReader.Trailer you were almost there:

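using iTextSharp.text.pdf;

// Reads the /ID entry from the trailer dictionary; returns null if the entry is absent.
public PdfArray GetId(string FileName)
{
    using (PdfReader pdfReader = new PdfReader(FileName))
    {
        return pdfReader.Trailer.GetAsArray(PdfName.ID);
    }
}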

This method returns the ID of the document, an array of two byte strings.

You seem to be interested in a hex representation of the ID. You can output it like this:

using System;
using System.Text;
using iTextSharp.text.pdf;

public void PrintId(PdfArray Id)
{
    if (Id != null)
    {
        StringBuilder builder = new StringBuilder();
        builder.Append("ID: ");
        foreach (PdfObject o in Id)
        {
            builder.Append("<");
            // X2 pads each byte to two hex digits, so 0x0C prints as "0C" rather than "C".
            foreach (byte b in ((PdfString)o).GetBytes())
                builder.AppendFormat("{0:X2}", b);
            builder.Append(">");
        }
        Console.WriteLine(builder.ToString());
    }
}

(I am not very proficient in .NET, so there may well be more elegant ways to create a hex dump of a byte array.)
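One possibly more compact way to build the hex dump, as a minimal sketch: BitConverter.ToString from the .NET base class library formats a byte array as hyphen-separated, two-digit uppercase hex pairs, so removing the hyphens gives the same kind of output as the loop above. The names HexDump and ToHex are made up for illustration and are not part of iTextSharp.

```csharp
using System;

// Hypothetical helper; the class and method names are illustrative only.
public static class HexDump
{
    // Formats each byte as two uppercase hex digits with no separators.
    public static string ToHex(byte[] bytes)
    {
        if (bytes == null) throw new ArgumentNullException("bytes");
        // BitConverter.ToString produces pairs joined by '-'; strip the separators.
        return BitConverter.ToString(bytes).Replace("-", "");
    }
}
```

Assuming the PdfArray returned by GetId is non-null and non-empty, something like HexDump.ToHex(((PdfString)Id[0]).GetBytes()) should then give just the first of the two identifiers, which is the value the question asks for.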

Tags: c#, pdf, asp.net, itextsharp