Editing a text file based on the content of another file via Python

Question

I have some wav files in a directory, like this:

BAC009S0177W0368.wav
BAC009S0231W0262.wav
BAC009S0517W0431.wav
BAC009S0002W0131.wav
...

I also have a text file:

BAC009S0002W0122|testing1
BAC009S0002W0123|testing2
BAC009S0517W0431|testing3
BAC009S0002W0131|testing4
...

How can I create another text file, using only Python, that contains just the lines whose names match the file names of those wav files? For example:

BAC009S0517W0431|testing3
BAC009S0002W0131|testing4
...

Thanks @zwer, and sorry that I forgot to post the code I tried. :( Here it is:

import os

wav_path = "/home/user/wav_files" # wav files directory
txt_path = "/home/user/text_file" # BAC009S0002W0122|testing1
output_path = "/home/user/output_text_file"

standard = []
for root, dirs, files in os.walk(wav_path):
    for index, filename in enumerate(files):
        standard.append(filename[:-4])
# print(len(standard))
# print(standard)

test = []
with open(txt_path, 'r', encoding='utf-8') as infile:
    for line in infile:
        parts = line.strip().split('|')
        test.append(parts[0])
        # print(test)

correct = set(standard) & set(test)
correct = list(correct)
# print(correct)
# print(len(correct))
# print(type(correct))

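For now, I can only extract the matching names from the text file and keep them as a list; I don't know how to write the full matching lines to a new file. :(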

Answer 1

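You can do this in three simple steps: first, list all the *.wav files in the directory and strip their extensions; then iterate over the lines of the text file and check whether the string before the first | is present in the file list from the first step; if it is, write that line to the output file. So: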

import os

wav_path = "/home/user/wav_files"
txt_path = "/home/user/text_file"
output_path = "/home/user/output_text_file"

# get the *.wav files list, sans their extension; store in a set for fast lookup
wav_files = {f[:-4] for f in os.listdir(wav_path) if f[-4:] == ".wav" and os.path.isfile(os.path.join(wav_path, f))}

# open the `txt_path` for reading and `output_path` for writing
with open(txt_path, "r", encoding="utf-8") as f_in, open(output_path, "w", encoding="utf-8") as f_out:
    for line in f_in:  # iterate the text file line by line
        if line.split("|", 1)[0] in wav_files:  # if present in the file list...
            f_out.write(line)  # write the line to the output file
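
To try the approach without touching real data, the same steps can be wrapped in a function and run against a temporary directory. This is only a sketch: the function name `filter_lines_by_wav` and the throwaway files it creates are made up for illustration, and it assumes the text file is UTF-8 encoded, as in the question's code.

```python
import os
import tempfile


def filter_lines_by_wav(wav_dir, txt_file, out_file):
    """Copy the lines of txt_file whose key (the text before the first '|')
    matches the name of a .wav file in wav_dir. Returns the number of lines kept."""
    # names of the *.wav files without extension, in a set for fast lookup
    wav_names = {
        os.path.splitext(name)[0]
        for name in os.listdir(wav_dir)
        if name.lower().endswith(".wav")
        and os.path.isfile(os.path.join(wav_dir, name))
    }
    kept = 0
    with open(txt_file, "r", encoding="utf-8") as f_in, \
            open(out_file, "w", encoding="utf-8") as f_out:
        for line in f_in:
            key = line.split("|", 1)[0].strip()
            if key in wav_names:
                f_out.write(line)  # keep the original line unchanged
                kept += 1
    return kept


# Throwaway demo data modelled on the question (hypothetical files).
with tempfile.TemporaryDirectory() as tmp:
    wav_dir = os.path.join(tmp, "wav_files")
    os.mkdir(wav_dir)
    for name in ("BAC009S0177W0368.wav", "BAC009S0517W0431.wav", "BAC009S0002W0131.wav"):
        open(os.path.join(wav_dir, name), "wb").close()  # empty placeholder file

    txt_file = os.path.join(tmp, "text_file")
    with open(txt_file, "w", encoding="utf-8") as f:
        f.write(
            "BAC009S0002W0122|testing1\n"
            "BAC009S0002W0123|testing2\n"
            "BAC009S0517W0431|testing3\n"
            "BAC009S0002W0131|testing4\n"
        )

    out_file = os.path.join(tmp, "output_text_file")
    kept = filter_lines_by_wav(wav_dir, txt_file, out_file)
    with open(out_file, "r", encoding="utf-8") as f:
        result = f.read()

print(kept)
print(result, end="")
```

With the sample data above, only the testing3 and testing4 lines should be kept, in the order they appear in the text file. Using `os.path.splitext` instead of slicing off the last four characters avoids relying on the extension length.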
