Sort the lines alphabetically in a CSV, TXT, or any other file

Question

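I need to write a program that prompts the user for a file name, loads the data line by line, removes any duplicate lines, sorts the lines alphabetically, and writes the remaining lines to another file.

I have most of the code done, but I am struggling to sort the lines alphabetically. Any suggestions?

Thanks in advance for your help!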

def deleteDuplicateRecords(fileName):
    try:
        newFileName="filtered_"+fileName
        with open(fileName,'r') as readFile, open(newFileName,'w') as writeFile:
            lineSet = set()
            for line in readFile:
                if line not in lineSet: 
                    lineSet.add(line)
                    writeFile.write(line)
        readFile.close()
        writeFile.close()
        print(f"Duplicate rows removed successfully. Open the new file '{newFileName}'")
    except FileNotFoundError:
        print("File Not Found")

name = input("Enter the name of the text file including the proper extension (.txt, .csv, etc): ")
print()

deleteDuplicateRecords(name)

Answer 1

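Accumulate the lines in a set
Sort the set with sorted, which returns a list
Then write the lines to the output file

By the way, using a with statement means you do not need to close the files manually.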
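
As a rough sketch of the three steps in this answer (the names sorted_unique_lines and write_sorted_unique are made up for illustration and do not come from the question):

```python
def sorted_unique_lines(lines):
    """Collect the lines in a set, then return them as a sorted list."""
    unique = set()
    for line in lines:
        unique.add(line)      # adding a duplicate to a set has no effect
    return sorted(unique)     # sorted() accepts any iterable and returns a list


def write_sorted_unique(in_name, out_name):
    """Hypothetical helper: read in_name, write its unique sorted lines to out_name."""
    # Both files are closed automatically when the with block ends.
    with open(in_name, "r") as read_file, open(out_name, "w") as write_file:
        write_file.writelines(sorted_unique_lines(read_file))
```

Keeping the set-and-sort step in its own function makes it possible to try it on a plain list of strings without touching any files.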

Answer 2

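You can read this document to learn about all the sorting techniques: https://docs.python.org/3/howto/sorting.html. Once the lines are sorted, write the sorted result to your output file.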
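
One technique from that guide that may matter here is the key argument. By default, sorted compares strings by code point, so uppercase letters come before lowercase ones. If "alphabetically" is meant to ignore case (an assumption, since the question does not say), something like this could be used; the sample lines are invented for illustration:

```python
# Sample data, made up for this example.
lines = ["banana\n", "Cherry\n", "apple\n"]

# Default ordering: "C" (code point 67) sorts before "a" (code point 97).
default_order = sorted(lines)
print(default_order)   # ['Cherry\n', 'apple\n', 'banana\n']

# Case-insensitive ordering: compare the case-folded form of each line.
folded_order = sorted(lines, key=str.casefold)
print(folded_order)    # ['apple\n', 'banana\n', 'Cherry\n']
```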

Answer 3

This should work:

def deleteDuplicateRecords(fileName):
    try:
        newFileName="filtered_"+fileName
        with open(fileName,'r') as readFile, open(newFileName,'w') as writeFile:
            for sorted_line in sorted({line for line in readFile}):
                writeFile.write(sorted_line)
        print(f"Duplicate rows removed successfully. Open the new file '{newFileName}'")
    except FileNotFoundError:
        print("File Not Found")

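A few notes:

As others have mentioned, if you use with, the files are closed automatically when the block ends.
If you use a set, duplicate items are not added to it, so there is no reason to check whether an item is already in the set.
If you have an iterable (a list, a set, etc.), you can simply use the sorted function to sort its contents.
The expression {line for line in readFile} uses a set comprehension to build a set from the file's lines in a more concise way.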
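
One edge case worth checking with the approach above: lines read from a file keep their trailing newline, except possibly the last one. If the final line has no newline, it will not match an otherwise identical earlier line, and after sorting it may be written directly in front of another line. A minimal sketch of one way around this, assuming it is acceptable for every output line to end with a newline (unique_sorted_text is a hypothetical name):

```python
def unique_sorted_text(text):
    """Return the unique lines of text, sorted, each ending with a newline."""
    # splitlines() drops the line endings, so "b" and "b\n" count as the same line.
    unique = set(text.splitlines())
    # Re-attach exactly one newline to every line when building the output.
    return "".join(line + "\n" for line in sorted(unique))


print(repr(unique_sorted_text("b\na\nb")))   # 'a\nb\n'
```

Inside the original function this could be called as writeFile.write(unique_sorted_text(readFile.read())), at the cost of reading the whole file into memory at once.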


python

file-io