PyMuPDF: how do I remove annotations?
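I am using PyMuPDF to loop through a list of strings, highlighting each one and capturing an image of it before moving on to the next string.
The code below does what I need, but the annotations persist after each iteration, and I would like to remove them once the image has been captured.
The example image below shows the word "command" highlighted, but the earlier strings "Images" and "filename" are still highlighted as well. Because I am compiling hundreds of these images into a report, I want the current string to stand out more clearly.
Is there something like page.remove(highlight)?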
    for pi in range(doc.pageCount):
        page = doc[pi]
        for tag in text_list:
            text = tag
            text_instances = page.searchFor(text)
            # padding around each hit: 5% of the page height and width
            five_percent_height = (page.rect.br.y - page.rect.tl.y) * 0.05
            five_percent_width = (page.rect.br.x - page.rect.tl.x) * 0.05
            for inst in text_instances:
                inst_counter += 1
                highlight = page.addSquigglyAnnot(inst)
                # clip rectangle around the hit, clamped to the page bounds
                tl_pt = fitz.Point(max(page.rect.tl.x, inst.tl.x - five_percent_width), max(page.rect.tl.y, inst.tl.y - five_percent_height))
                br_pt = fitz.Point(min(page.rect.br.x, inst.br.x + five_percent_width), min(page.rect.br.y, inst.br.y + five_percent_height))
                hl_clip = fitz.Rect(tl_pt, br_pt)
                zoom_mat = fitz.Matrix(4, 4)
                pix = page.getPixmap(matrix=zoom_mat, clip=hl_clip)
                # I want to remove the annotation here
One acceptable workaround I found is to set the annotation's opacity to 0% after capturing the image:
    pix = page.getPixmap(matrix=zoom_mat, clip=hl_clip)
    highlight.setOpacity(0)
    highlight.update()
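Do it like this:
    annot = page.first_annot  # spelled page.firstAnnot in older releases
    while annot:
        annot = page.delete_annot(annot)
The method returns the annotation that follows the deleted one, so the loop ends once no annotations are left on the page.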
Jorj's approach works well. However, the documentation describes another option:
https://pymupdf.readthedocs.io/en/latest/faq.html#how-to-read-and-update-pdf-objects
This method can also be used to remove a key from the xref dictionary by setting its value to null. The following will remove the rotation specification from the page: doc.xref_set_key(page.xref, "Rotate", "null"). Similarly, to remove all links, annotations and fields from a page, use doc.xref_set_key(page.xref, "Annots", "null"). Because Annots by definition is an array, setting an empty array with the statement doc.xref_set_key(page.xref, "Annots", "[]") would do the same job in this case.