Make last paragraph - active pointer

Question

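I am trying to write my research paper in markdown, and my institution requires submissions as Word documents. I decided to use the python-docx package to automate this task.

However, I am struggling with some specific tasks, such as adding data to the end of the file.

So here I am.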


from docx import Document


def merge(docx, files):
    """Merge other docx files into the parent docx document."""
    docx._body.clear_content()

    elements = []
    for idx, file in enumerate(files):
        donor = Document(file)
        donor.add_page_break()

        for element in donor.element.body:
            elements.append(element)

    for element in elements: 
        docx.element.body.append(element)

# base styles 
document = Document("docx/base.docx")

# adding two preformatted files with really fragile formatting.
merge(document, ["docx/Tytulka.docx", "docx/Zavdania.docx"])

document.add_paragraph("hey")
document.save("tmp_result.docx")

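So what I get in tmp_result.docx is: hey -> content from 1st file -> content from 2nd file.

I looked through the code and successfully used insert_paragraph_after*, which added a paragraph at the end of the file.

So here is the question: how do I ask/trick the document object into using the last paragraph as the current element pointer? Its default behavior should work, but I changed the structure of the document by merging, and new content now gets added at the first paragraph of the file.

I tried the following trick, but the results were unexpectedly unsatisfying**, and after that I decided to stop playing with APIs (Word and python-docx) that I don't understand.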

# trick I use to move active paragraph to the end.

def merge(docx, files):
    docx._body.clear_content()

    elements = []
    for idx, file in enumerate(files):
        donor = Document(file)
        donor.add_page_break()

        for element in donor.element.body:
            elements.append(element)

    for element in elements:
        # moving last paragraph to the end of file.
        tmp = docx.element.body[-1]
        docx.element.body[-1] = element
        docx.element.body.append(tmp)

# base styles 
document = Document("docx/base.docx")

# adding two preformatted files with really fragile formatting.
merge(document, ["docx/Tytulka.docx", "docx/Zavdania.docx"])

document.add_paragraph("hey")
document.save("tmp_result.docx")

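I wish I could spend more time studying the Word specification and the python-docx code, but I really can't. So here is the question:

How do I tell python-docx to write after a specific (last) paragraph?

ANSWER/SOLUTION, credit to scanny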

The problem with just appending to the body element is there is a "sentinel" sectPr element at the end of the body and it needs to stay there (like not have paragraphs after it). by @scanny

With this valuable information, here is what I did next.


def merge(docx, files):
    """
    Merge existing docx files into docx.
    """
    docx._body.clear_content()

    elements = []
    for idx, file in enumerate(files):
        donor = Document(file)
        donor.add_page_break()

        # all except donor sentinel sectPr
        for element in donor.element.body[:-1]:
            elements.append(element)

    # moving the docx sentinel to the end and adding elements from
    # donors
    for element in elements:
        tmp = docx.element.body[-1]
        docx.element.body[-1] = element
        docx.element.body.append(tmp)


if __name__ == "__main__":

    # adding the title page and preformatted docx files.
    document = Document("docx/base.docx")
    merge(document, ["docx/Tytulka.docx", "docx/Zavdania.docx"])

    # document.add_paragraph("hey")

    # open for tests
    # os.system("kill -9 $(ps -e -o pid,args | grep Word.app | awk '{print }' | head -1)")
    # this part accepts the current document,
    # transforms the markdown files that match the pattern by adding them
    # to the docx,
    # then saves and opens the document.
    Builder(document).build("texts/13*.md").save("tmp_result.docx").open()

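As a result: content of the 1st file -> content of the 2nd file -> Markdown generated content

Win! Win! Win!

* You will not find a method insert_paragraph_after in the package, but it is exactly the same as insert_paragraph_before; the only difference is that the paragraph is created and inserted after the given paragraph (see the add_p_before method of the CT_P class; you can use addnext of BaseOxmlElement).
** The result of moving the current pointer p was: content from 1st file -> hey -> content from 2nd file, which makes no sense (because I really don't know the Word and python-docx APIs).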

Answer 1

Hmm, I'm not sure I fully understand what you're trying to do, but I think what you're asking for is this:

last_p_in_document = document.paragraphs[-1]._p
last_p_in_document.addnext(new_p)  # new_p is the next <w:p> element to insert
last_p_in_document = new_p
# ---etc.---

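The problem with just appending to the body element is that there is a "sentinel" sectPr element at the end of the body, and it needs to stay there (that is, no paragraphs may come after it). Another approach you could take is to locate that element with sectPr = body[-1] and then use sectPr.addprevious(next_element_to_be_added), which actually looks like the simpler approach. sectPr will continue to be the last child of body (so you don't have to reset it after each element you insert), and you can add table elements as well as paragraph elements with the same code.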


Tags: python, ms-word, python-docx