iTextSharp 5.5.6 PdfCopy Failing with "Cannot access a closed file"
This seems similar to this question: Merging Tagged PDF without ruining the tags
I'm using the latest iTextSharp NuGet package (v5.5.6) to try to merge two tagged PDFs. When I call Document.Close(), I get an ObjectDisposedException originating from PdfCopy.FlushIndirectObjects():
at System.IO.__Error.FileNotOpen()
at System.IO.FileStream.get_Position()
at iTextSharp.text.io.RAFRandomAccessSource.Get(Int64 position, Byte[] bytes, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\io\RAFRandomAccessSource.cs:line 96
at iTextSharp.text.io.IndependentRandomAccessSource.Get(Int64 position, Byte[] bytes, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\io\IndependentRandomAccessSource.cs:line 76
at iTextSharp.text.pdf.RandomAccessFileOrArray.Read(Byte[] b, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\RandomAccessFileOrArray.cs:line 235
at iTextSharp.text.pdf.RandomAccessFileOrArray.ReadFully(Byte[] b, Int32 off, Int32 len) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\RandomAccessFileOrArray.cs:line 264
at iTextSharp.text.pdf.RandomAccessFileOrArray.ReadFully(Byte[] b) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\RandomAccessFileOrArray.cs:line 254
at iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw(PRStream stream, RandomAccessFileOrArray file) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfReader.cs:line 2406
at iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw(PRStream stream) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfReader.cs:line 2443
at iTextSharp.text.pdf.PRStream.ToPdf(PdfWriter writer, Stream os) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PRStream.cs:line 224
at iTextSharp.text.pdf.PdfIndirectObject.WriteTo(Stream os) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfIndirectObject.cs:line 157
at iTextSharp.text.pdf.PdfWriter.PdfBody.Write(PdfIndirectObject indirect, Int32 refNumber, Int32 generation) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfWriter.cs:line 389
at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, Int32 refNumber, Int32 generation, Boolean inObjStm) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfWriter.cs:line 379
at iTextSharp.text.pdf.PdfCopy.WriteObjectToBody(PdfIndirectObject objecta) in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfCopy.cs:line 1238
at iTextSharp.text.pdf.PdfCopy.FlushIndirectObjects() in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfCopy.cs:line 1186
at iTextSharp.text.pdf.PdfCopy.FlushTaggedObjects() in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfCopy.cs:line 884
at iTextSharp.text.pdf.PdfDocument.Close() in d:\Downloads\itextsharp-master\src\core\iTextSharp\text\pdf\PdfDocument.cs:line 825
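Here is the code that produces the exception. If I don't call copy.SetTagged() and don't pass true as the third argument to GetImportedPage(), the code runs without an exception, but all tagging is ignored.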
using(var ms = new MemoryStream())
{
    var doc = new Document();
    var copy = new PdfSmartCopy(doc, ms);
    copy.SetTagged();
    doc.Open();
    string[] files = new string[]{@"d:\tagged.pdf", @"d:\tagged.pdf"};
    foreach(var f in files)
    {
        var reader = new PdfReader(f);
        int pages = reader.NumberOfPages;
        for(int i = 0; i < pages;)
            copy.AddPage(copy.GetImportedPage(reader, ++i, true));
        copy.FreeReader(reader);
        reader.Close();
    }
    // ObjectDisposedException
    doc.Close();
    ms.Flush();
    File.WriteAllBytes(@"d:\pdf.merged.v5.pdf", ms.ToArray());
}
Looking at the 5.5.6 source branch, line 96 of RAFRandomAccessSource.cs appears to be the culprit.
public virtual int Get(long position, byte[] bytes, int off, int len) {
    if (position > length)
        return -1;
    // Not thread safe!
    if (raf.Position != position)
By this point raf has already been disposed (reading raf.Position throws), but I can't tell where it is being disposed.
I'm hoping that I need to do something more than just call copy.SetTagged() and pass true to GetImportedPage() to fix this.
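You are closing the PdfReader instances too early. You may only call:
reader.Close();
after the PdfSmartCopy instance has been closed, so you need to rethink where you create the different PdfReader objects (not inside the loop).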
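The reason the different PdfReader instances have to stay open is purely technical: merging the structure tree (where all the tagging information is stored) is not trivial. It can only happen once all the other work is done, and it needs access to the original structures of the separate documents. If you close the PdfReader for such a document, that structure can no longer be retrieved.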
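To make the ordering concrete, here is a minimal sketch of the question's code rearranged so that every PdfReader outlives doc.Close(). It is only an illustration, not a tested fix: the class name TaggedMerge, the method Merge, and its parameters are hypothetical, and leaving out the copy.FreeReader(reader) call is an assumption on my part (it releases reader state that the tagged flush may still need).

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper; the names are illustrative and not part of iTextSharp.
static class TaggedMerge
{
    public static void Merge(IEnumerable<string> files, string outputPath)
    {
        // Readers are created outside the copy loop and kept until the very end.
        var readers = new List<PdfReader>();
        try
        {
            foreach (var f in files)
                readers.Add(new PdfReader(f));

            using (var ms = new MemoryStream())
            {
                var doc = new Document();
                var copy = new PdfSmartCopy(doc, ms);
                copy.SetTagged();
                doc.Open();

                foreach (var reader in readers)
                {
                    int pages = reader.NumberOfPages;
                    for (int i = 1; i <= pages; i++)
                        copy.AddPage(copy.GetImportedPage(reader, i, true));
                    // Assumption: no FreeReader/Close here; the readers must stay open.
                }

                // The structure trees are merged during Close(), so all readers
                // must still be open at this point.
                doc.Close();
                File.WriteAllBytes(outputPath, ms.ToArray());
            }
        }
        finally
        {
            // Only now is it safe to close the readers.
            foreach (var reader in readers)
                reader.Close();
        }
    }
}
```

With the question's inputs this would be called as TaggedMerge.Merge(new[] { @"d:\tagged.pdf", @"d:\tagged.pdf" }, @"d:\pdf.merged.v5.pdf"). The finally block closes the readers even if the merge throws, which keeps the file handles from leaking.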