Searching for keywords in a directory of PowerPoint files

I'm trying to write a Python program that searches PowerPoint slides for a keyword. This is what I have so far, but I keep getting an error saying it expects a zip file: "zipfile.BadZipFile: File is not a zip file". Thanks.

from pptx import Presentation
import os

def main():

    while(True):
        search = input("Keyword: ")
        result = []
        for filename in os.listdir():
            f = open(filename)
            pres = Presentation(f)
            for slide in pres.slides:
                for shape in slide.shapes:
                    if not shape.has_text_frame:
                        continue
                    for paragraph in shape.text_frame.paragraphs:
                        for run in paragraph.runs:
                            if search in run.text:
                                result.append(run.text)
                                result.append(" - ")
                                result.append(filename)
                            else:
                                continue
            f.close()
            print(result)
if __name__ == '__main__':
    main()

You don't need to open the file before passing it to Presentation(). Just pass the filename.

prs = Presentation(filename)

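Also make sure every file you handle this way is actually a PPTX file. os.listdir() returns every entry in the directory, not just presentations, so you probably want a few lines like: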

for filename in os.listdir():
    if not filename.endswith('.pptx'):
        continue
    prs = Presentation(filename)
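An extension check alone will still let through a file that is merely named .pptx. Since a .pptx is a ZIP archive, one possible way to tighten the filter is the standard library's zipfile.is_zipfile(). The sketch below is only an illustration: the helper name find_pptx_files is made up here, and skipping names that start with "~$" assumes those are lock files left behind by an open PowerPoint session.

```python
import os
import zipfile


def find_pptx_files(directory="."):
    """Return sorted paths in `directory` that look like real .pptx files."""
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Case-insensitive extension check, as in the loop above
        if not name.lower().endswith(".pptx"):
            continue
        # Assumption: names starting with "~$" are PowerPoint lock files
        if name.startswith("~$"):
            continue
        # A .pptx is a ZIP archive; anything failing this check
        # would raise BadZipFile when opened as a presentation
        if zipfile.is_zipfile(path):
            found.append(path)
    return found
```

Each returned path can then be passed straight to Presentation().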

If for some reason you do want to pass an open file, you need to open it in binary mode:

f = open(filename, 'rb')