How do I read a .docx file in Google Colab?

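I'm trying to read a .docx file into Google Colab because my main computer, the one with Anaconda installed, is out of service. I'm trying to use the python-docx module, but as far as I can tell, I can't just pip install python-docx in Google Colab.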

```python
import docx

def getText(filename):
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    return '\n'.join(fullText)

docxString = getText("week_8_document1.docx")
```

Any ideas?

Try the following; hopefully it works:

#Install python-docx
!pip install python-docx #<-- Yes you can directly install in Colab

#Import the tools
import docx
from google.colab import files

uploaded = files.upload() #<-- Select the file you want to upload
file_name = '[whatever your file is called here].docx' #<-- Change filename to your file
doc = docx.Document(file_name)
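
If you'd rather not type the file name by hand, here is a minimal sketch that picks it out of the dictionary returned by `files.upload()`, which maps each uploaded file name to its raw bytes. The helper name `first_docx_name` is made up for illustration; it is not part of `google.colab` or python-docx.

```python
def first_docx_name(uploaded):
    """Return the first key in `uploaded` that ends with .docx (case-insensitive).

    `uploaded` is expected to look like the dict returned by
    google.colab.files.upload(): {file_name: file_bytes}.
    """
    for name in uploaded:
        if name.lower().endswith(".docx"):
            return name
    raise ValueError("no .docx file found in the upload")


# In Colab (assumes python-docx is installed and `uploaded = files.upload()` was run):
#   import io, docx
#   name = first_docx_name(uploaded)
#   doc = docx.Document(io.BytesIO(uploaded[name]))  # Document() also accepts a file-like object
```

Opening the document from the uploaded bytes, as in the commented usage, should avoid any dependence on where the file ends up on disk.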

Once the document is loaded, you can access its text by paragraph, by table, and so on. Good luck!
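
To make that last point concrete, here is a rough sketch of pulling paragraph text and table cell text out of a loaded document. It relies only on the `paragraphs`, `tables`, `rows`, `cells`, and `text` attributes that python-docx exposes; the function names `paragraph_texts` and `table_rows` are invented for this example.

```python
def paragraph_texts(doc):
    """Return the text of every non-empty body paragraph."""
    return [p.text for p in doc.paragraphs if p.text.strip()]


def table_rows(doc):
    """Return each table as a list of rows, each row a list of cell strings."""
    return [
        [[cell.text for cell in row.cells] for row in table.rows]
        for table in doc.tables
    ]


# In Colab, after `doc = docx.Document(file_name)`:
#   print("\n".join(paragraph_texts(doc)))
#   for table in table_rows(doc):
#       for row in table:
#           print(" | ".join(row))
```

Note that `doc.paragraphs` only covers body paragraphs, so text that sits inside tables has to be read through `doc.tables`.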