Read a .txt file in Python without special characters replacing the original characters in the file

Question

I would like to know how to read a .txt file so that special characters do not replace its contents and the original text is preserved.

I am using the following code:

with open('D:/nap31.txt') as gh:
    line = True
    while line:
        line = gh.readline()

Sample content from the nap31.txt file:

Teda Production Site Oranienburg Lehnitzstr. 70 – 98 16515 Oranienburg France packaging

Zene AB Gärtunavägen SE-151 85 Södertälje SWEDEN Testing

After opening and reading the file with the code above, the content becomes:

Teda Production Site Oranienburg Lehnitzstr. 70 â€“ 98 16515 Oranienburg France packaging

ï»¿Zene AB GÃ¤rtunavÃ¤gen SE-151 85 SÃ¶dertÃ¤lje SWEDEN Testing

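So "â€“" is replacing the "–" in my file, and other special characters are likewise being replaced with other sequences. Can anyone help me solve this?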

Answer 1

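When you open a file in Python without specifying an encoding, the default depends on the platform; on Windows it is typically an ANSI code page such as cp1252. Your file appears to be saved as UTF-8 (the "ï»¿" in your output is what a UTF-8 byte order mark looks like when decoded as cp1252), so its multi-byte characters are decoded incorrectly. Nothing in the file itself is overwritten; only the decoding is wrong. To fix this, pass the encoding explicitly by changing your code to: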

with open('D:/nap31.txt', encoding='utf-8') as gh:
    line = True
    while line:
        line = gh.readline()
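
To illustrate the difference, here is a minimal, self-contained sketch. It assumes the file is UTF-8 with a byte order mark, which the "ï»¿" in the output suggests; the helper name read_lines and the temporary sample file are made up for this example and stand in for nap31.txt. It uses 'utf-8-sig' rather than 'utf-8', since that codec also strips the byte order mark if one is present (with plain 'utf-8' the mark would remain in the text as an invisible '\ufeff' character).

```python
import os
import tempfile


def read_lines(path, encoding="utf-8-sig"):
    """Return the lines of a text file decoded with the given encoding."""
    with open(path, encoding=encoding) as fh:
        return [line.rstrip("\n") for line in fh]


# Build a small UTF-8 file with a byte order mark (stand-in for nap31.txt).
sample = "Lehnitzstr. 70 \u2013 98\nGärtunavägen Södertälje\n"
fd, tmp_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8-sig", newline="\n") as fh:
    fh.write(sample)

# Decoding the UTF-8 bytes as cp1252 reproduces the garbled output.
wrong = read_lines(tmp_path, encoding="cp1252")
# Decoding as UTF-8 (and dropping the byte order mark) restores the text.
right = read_lines(tmp_path)

os.remove(tmp_path)

print(wrong)
print(right)
```

If you are unsure which default your system uses, locale.getpreferredencoding(False) from the standard library reports it.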

python

unicode

utf-8

special-characters

python-3.x