How do I print the words that one .txt / .py file has but that another .txt / .py file I compare it with does not?

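I tried to compare two .py code files with this code, but it only gives me the last few lines of code. For example, if file 1 has 2014 lines and file 2 has 2004 lines, it returns the last 10 lines of file1. That is not what I need: I need to extract the lines that are in file1 but not in file2.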

import shutil

file1 = 'bot-proto7test.py'
file2 = 'bot-proto7.py'

with open(file1, 'r') as file1:
    with open(file2) as file2:
        with open ("output.txt", "w") as out_file:
            file2.seek(0, 2)
            file1.seek(file2.tell())
            shutil.copyfileobj(file1, out_file)

You can use sets to do this:

with open(file1, 'r') as f:
    set1 = {*f.readlines()}

with open(file2, 'r') as f:
    set2 = {*f.readlines()}

print(set1 - set2)  # contains only the lines that are in the first file but not in the second
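
One thing to watch with readlines() is that each line keeps its trailing newline, so the last line of a file that does not end in a newline will not compare equal to the same text elsewhere. Here is a minimal sketch of the set approach that strips the newline first. The function names read_line_set and lines_only_in_first and the utf-8 encoding are my own assumptions, not something from the question:

```python
def read_line_set(path):
    """Return the set of distinct lines in a file, without trailing newlines."""
    with open(path, "r", encoding="utf-8") as f:
        # rstrip("\n") so the final line matches even if the file
        # does not end with a newline character
        return {line.rstrip("\n") for line in f}


def lines_only_in_first(path1, path2):
    """Distinct lines that occur in path1 but not in path2 (unordered)."""
    return read_line_set(path1) - read_line_set(path2)
```

Calling lines_only_in_first('bot-proto7test.py', 'bot-proto7.py') should then give the lines that exist only in the first file. Being a set, the result has no order and no duplicates.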

By the way, you can open multiple files with a single with statement!

with open("f1.txt", "r") as f1, open("f2.txt", "r") as f2:
    set1, set2 = {*f1.readlines()}, {*f2.readlines()}

If we want to keep repeated lines, we can use Counter:

from collections import Counter

with open(file1, 'r') as f:
    c = Counter(f.readlines())

# simple subtraction won't work here if the first file contains more occurrences than the second
res = Counter({k: v for k, v in c.items() if k not in set2})
print(list(res.elements()))
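
To see why plain Counter subtraction behaves differently from the filter, here is a small sketch on in-memory lists of lines instead of files. The function name count_lines_only_in_first and the sample data are made up for illustration:

```python
from collections import Counter


def count_lines_only_in_first(lines1, lines2):
    """Count lines of lines1 that never occur in lines2, keeping repeats."""
    other = set(lines2)  # set lookup is O(1) per line
    return Counter(line for line in lines1 if line not in other)


lines1 = ["a\n", "a\n", "a\n", "b\n", "b\n"]
lines2 = ["a\n"]

# Filtering by membership drops "a\n" entirely because it exists in lines2
filtered = count_lines_only_in_first(lines1, lines2)

# Counter subtraction only removes one occurrence of "a\n", so two remain
subtracted = Counter(lines1) - Counter(lines2)
```

Here filtered only holds "b\n" (twice), while subtracted still holds "a\n" twice, which is usually not what you want when the goal is "lines that are not in the other file at all".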

Finally, if you also want to preserve the order, you need to use the original content:

with open(file1, 'r') as f:
    original = f.readlines()

res = {*original} - set2  # lines that occur only in the first file
res = [el for el in original if el in res]  # keep them in their original order
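
The order-preserving variant can also be written as one small helper that works on any two lists of lines. This is only a sketch, and the name ordered_difference is my own:

```python
def ordered_difference(lines1, lines2):
    """Lines of lines1 that are absent from lines2, in their original order.

    Repeated lines are kept as many times as they appear in lines1.
    """
    other = set(lines2)  # build the lookup set once
    return [line for line in lines1 if line not in other]


result = ordered_difference(["c\n", "a\n", "b\n", "c\n"], ["a\n"])
```

With file contents you would pass in the two readlines() results and write the returned list to the output file with writelines().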