iTextSharp get Actions from Table Of Content PDF C#
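I have a PDF with a table of contents.
Using iTextSharp.dll I am trying to get the annotations and then the actions on those annotations. I then want to manipulate the link so that it points to another page. For example, if Chapter 1 in the table of contents points to page 5, I want it to point to page 2 when I click the link. For some reason the action on the annotation is null, so I cannot manipulate this data. The code below runs, but it always gives me a null action, and I don't understand why.
To reproduce the PDF in question:
- Create a 3-page Word document
- Page 1 is the table of contents, page 2 is Chapter 1, page 3 is Chapter 2
- Export it as PDF
- Once you have the PDF, the TOC should be 'clickable'.
I then want to be able to manipulate where a click takes you. Thanks.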
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using iTextSharp.text;
    using iTextSharp.text.pdf;
    using System.IO;
    using System.Collections;

    namespace PDFLinks
    {
        class Program
        {
            //Folder that we are working in
            //private static readonly string WorkingFolder = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Hyperlinked PDFs");
            //Sample PDF (verbatim string, otherwise "\T" is an invalid escape sequence)
            private static readonly string BaseFile = Path.Combine(@"C:\Temp", "TableOfContentsTest.pdf");
            //Final file
            private static readonly string OutputFile = Path.Combine(@"C:\Temp", "NewFile.pdf");

            static void Main(string[] args)
            {
                //Open our reader
                PdfReader R = new PdfReader(BaseFile);
                //Get the page count
                int PageCount = R.NumberOfPages;
                Console.WriteLine("Page Count= " + PageCount);

                //Loop through each page
                //for (int i = 1; i <= PageCount; i++)
                //{
                //Get the current page
                PdfDictionary PageDictionary = R.GetPageN(1);
                //Get all of the annotations for the current page
                PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);

                //Make sure we have something (Size is the number of array elements)
                if ((Annots == null) || (Annots.Size == 0))
                {
                    Console.WriteLine("nothing");
                    return;
                }

                Console.WriteLine("ANNOTS Not Null" + Annots[0]);
                //Loop through each annotation
                foreach (PdfObject A in Annots.ArrayList)
                {
                    //Resolve the (possibly indirect) entry to the annotation dictionary
                    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(A);

                    //Make sure this annotation is a link
                    if (!PdfName.LINK.Equals(AnnotationDictionary.Get(PdfName.SUBTYPE)))
                        continue;

                    //Make sure this annotation has an ACTION
                    if (AnnotationDictionary.Get(PdfName.A) == null)
                        continue;
                    Console.WriteLine("ACTION Not Null");

                    //Get the ACTION for the current annotation
                    PdfDictionary AnnotationAction = AnnotationDictionary.GetAsDict(PdfName.A);

                    // Test if it is a URI action (There are tons of other types of actions,
                    // some of which might mimic URI, such as JavaScript,
                    // but those need to be handled separately)
                    if (PdfName.URI.Equals(AnnotationAction.Get(PdfName.S)))
                    {
                        PdfString Destination = AnnotationAction.GetAsString(PdfName.URI);
                        string url1 = Destination.ToString();
                    }
                }
                //}
            }
        }
    }
Destinations
In your Link annotations you only look for the A (action) entry, but there may be a Dest (destination) entry instead, cf. the PDF specification ISO 32000-2:
A (dictionary): (Optional; PDF 1.1) An action that shall be performed when the link annotation is activated (see 12.6, "Actions").
Dest (array, name or byte string): (Optional; not permitted if an A entry is present) A destination that shall be displayed when the annotation is activated (12.3.2, "Destinations").
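(ISO 32000-2, Table 176, "Additional entries specific to a link annotation")
Destinations come in several types; see section 12.3.2, "Destinations", of the specification. Code that handles some of these types may also be worth a look.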
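As a rough illustration of the point above, the following sketch looks at the Dest entry of a link annotation and reports which kind of destination it holds: an array is an explicit destination, while a name or byte string is a named destination that still has to be looked up in the document. The class and method names (DestInspector, DescribeDest) are made up for this example, and the annotation is built in memory rather than read from your file.

```csharp
using System;
using iTextSharp.text.pdf;

static class DestInspector
{
    // Reports what kind of destination the Dest entry of an annotation holds.
    // DestInspector and DescribeDest are hypothetical names for this sketch.
    public static string DescribeDest(PdfDictionary annotation)
    {
        // GetDirectObject resolves an indirect reference, if there is one
        PdfObject dest = annotation.GetDirectObject(PdfName.DEST);
        if (dest == null)
            return "no Dest entry";
        if (dest.IsArray())
            return "explicit destination: " + dest;
        if (dest.IsName() || dest.IsString())
            return "named destination: " + dest;
        return "unexpected Dest type";
    }

    static void Main()
    {
        // In-memory stand-in for a TOC link that uses Dest instead of A
        PdfDictionary link = new PdfDictionary();
        link.Put(PdfName.SUBTYPE, PdfName.LINK);
        link.Put(PdfName.DEST, new PdfString("Chapter1"));

        Console.WriteLine(DescribeDest(link));
        Console.WriteLine(DescribeDest(new PdfDictionary()));
    }
}
```

In your own loop you would call something like DescribeDest(AnnotationDictionary) before giving up on an annotation whose A entry is null.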
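Actions
Even if the Link does have an A entry, you only consider a) the first action and b) actions of type URI.
Multiple actions
Links can trigger a sequence of actions, with the subsequent actions referenced from the first one, cf. the specification: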
Next (dictionary or array): (Optional; PDF 1.2) The next action or sequence of actions that shall be performed after the action represented by this dictionary. The value is either a single action dictionary or an array of action dictionaries that shall be performed in order; see Note 1 for further discussion.
NOTE 1 The action dictionary’s Next entry (PDF 1.2) allows sequences of actions to be chained together. For example, the effect of clicking a link annotation with the mouse might be to play a sound, jump to a new page, and start up a movie. Note that the Next entry is not restricted to a single action but may contain an array of actions, each of which in turn may have a Next entry of its own. The actions may thus form a tree instead of a simple linked list.
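(ISO 32000-2, Table 196, "Entries common to all action dictionaries")
As the example in the NOTE shows, the jump to a new page is not necessarily the first action of a Link, so for your task you should explicitly check all actions of your Link.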
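Checking all actions means following the Next entries, which may be a single dictionary or an array and can therefore form a tree. A minimal sketch of such a walk, assuming iTextSharp 5.x; the names ActionWalker and Collect are invented for this example, and the chain in Main is built in memory to mirror the "play a sound, then jump" case from the NOTE:

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;

static class ActionWalker
{
    // Collects the given action and everything reachable through Next,
    // visiting each action before its followers (depth first).
    public static List<PdfDictionary> Collect(PdfDictionary action)
    {
        var result = new List<PdfDictionary>();
        Visit(action, result);
        return result;
    }

    static void Visit(PdfDictionary action, List<PdfDictionary> result)
    {
        // Skip missing entries; the Contains check (by reference) guards
        // against cycles in a malformed file
        if (action == null || result.Contains(action))
            return;
        result.Add(action);

        PdfObject next = action.GetDirectObject(PdfName.NEXT);
        if (next == null)
            return;
        if (next.IsDictionary())
        {
            Visit((PdfDictionary)next, result);
        }
        else if (next.IsArray())
        {
            PdfArray array = (PdfArray)next;
            for (int i = 0; i < array.Size; i++)
                Visit(array.GetAsDict(i), result);
        }
    }

    static void Main()
    {
        // First action plays a sound, the follow-up action is the jump
        PdfDictionary jump = new PdfDictionary();
        jump.Put(PdfName.S, PdfName.GOTO);

        PdfDictionary sound = new PdfDictionary();
        sound.Put(PdfName.S, new PdfName("Sound"));
        sound.Put(PdfName.NEXT, jump);

        foreach (PdfDictionary a in Collect(sound))
            Console.WriteLine(a.GetAsName(PdfName.S));
    }
}
```

With a real file you would pass AnnotationDictionary.GetAsDict(PdfName.A) to Collect and then inspect the S entry of every returned action instead of only the first one.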
Action types
A uniform resource identifier (URI) is a string that identifies (resolves to) a resource on the Internet — typically a file that is the destination of a hypertext link, although it may also resolve to a query or other entity. (URIs are described in Internet RFC 3986, Uniform Resource Identifiers (URI): Generic Syntax.)
A URI action causes a URI to be resolved.
(ISO 32000-2, section 12.6.4.8, "URI actions")
Thus, URI actions are unlikely to be found in the table of contents of a PDF. You had better look for GoTo actions.
A go-to action changes the view to a specified destination (page, location, and magnification factor). "Table 202 — Additional entries specific to a go-to action" shows the action dictionary entries specific to this type of action.
NOTE Specifying a go-to action in the A entry of a link annotation or outline item (see "Table 176 — Additional entries specific to a link annotation" and "Table 151 — Entries in an outline item dictionary") has the same effect as specifying the destination directly with the Dest entry.
(ISO 32000-2, section 12.6.4.2, "Go-To actions")
D (name, byte string, or array): (Required) The destination to jump to (see 12.3.2, "Destinations").
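(ISO 32000-2, Table 202, "Additional entries specific to a go-to action")
Thus, when inspecting GoTo actions you eventually have to deal with the same kinds of destination specifications as when inspecting the direct link destinations discussed at the top of this answer.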
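Putting the pieces together for your actual goal, here is a hedged sketch of how a link could be pointed at another page, assuming iTextSharp 5.x. It only handles explicit (array) destinations, found either directly in Dest or in the D entry of a first-level GoTo action; named destinations and actions further down a Next chain are left alone. The names LinkRetargeter and Retarget are invented for this example, and the file paths are the ones from your question.

```csharp
using System;
using System.IO;
using iTextSharp.text.pdf;

static class LinkRetargeter
{
    // Points an explicit destination at another page. Returns false if the
    // annotation has no explicit destination (e.g. a named destination).
    public static bool Retarget(PdfDictionary annotation, PdfObject newPageRef)
    {
        // Work out which dictionary and key hold the destination
        PdfDictionary holder = annotation;
        PdfName key = PdfName.DEST;
        if (annotation.Get(PdfName.DEST) == null)
        {
            PdfDictionary action = annotation.GetAsDict(PdfName.A);
            if (action == null || !PdfName.GOTO.Equals(action.GetAsName(PdfName.S)))
                return false;
            holder = action;
            key = PdfName.D;
        }

        PdfArray old = holder.GetAsArray(key);
        if (old == null || old.Size == 0)
            return false;

        // The first element of an explicit destination is the target page;
        // the fit type and coordinates after it are copied unchanged
        PdfArray dest = new PdfArray();
        dest.Add(newPageRef);
        for (int i = 1; i < old.Size; i++)
            dest.Add(old[i]);
        holder.Put(key, dest);
        return true;
    }

    static void Main()
    {
        PdfReader reader = new PdfReader(@"C:\Temp\TableOfContentsTest.pdf");
        PdfArray annots = reader.GetPageN(1).GetAsArray(PdfName.ANNOTS);
        for (int i = 0; annots != null && i < annots.Size; i++)
        {
            PdfDictionary annot = annots.GetAsDict(i);
            if (annot != null && PdfName.LINK.Equals(annot.GetAsName(PdfName.SUBTYPE)))
            {
                // Send every link on page 1 to page 2, as in the question
                bool changed = Retarget(annot, reader.GetPageOrigRef(2));
                Console.WriteLine("link " + i + (changed ? " retargeted" : " skipped"));
            }
        }

        // Write the modified document to a new file
        using (FileStream fs = new FileStream(@"C:\Temp\NewFile.pdf", FileMode.Create))
        {
            PdfStamper stamper = new PdfStamper(reader, fs);
            stamper.Close();
        }
        reader.Close();
    }
}
```

Because the coordinates of the old destination are kept, the view on the new page may not land where you want it; replacing them with a plain /Fit destination would be one alternative.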