Editing a text file based on the content of another file via Python

I have some wav files in a directory, like this:

  1. BAC009S0177W0368.wav
  2. BAC009S0231W0262.wav
  3. BAC009S0517W0431.wav
  4. BAC009S0002W0131.wav
  5. ...

I also have a text file:

  1. BAC009S0002W0122|testing1
  2. BAC009S0002W0123|testing2
  3. BAC009S0517W0431|testing3
  4. BAC009S0002W0131|testing4
  5. ...

How can I create another text file, using only Python, that contains just the lines whose IDs match the names of those wav files? For example:

  1. BAC009S0517W0431|testing3
  2. BAC009S0002W0131|testing4
  3. ...

Thanks @zwer, and sorry that I forgot to post my attempt. :( Here it is:

import os

wav_path = "/home/user/wav_files" # wav files directory
txt_path = "/home/user/text_file" # BAC009S0002W0122|testing1
output_path = "/home/user/output_text_file"

standard = []
for root, dirs, files in os.walk(wav_path):
    for index, filename in enumerate(files):
        standard.append(filename[:-4])
# print(len(standard))
# print(standard)

test = []
with open(txt_path, 'r', encoding='utf-8') as infile:
    for line in infile:
        parts = line.strip().split('|')
        test.append(parts[0])
        # print(test)

correct = set(standard) & set(test)
correct = list(correct)
# print(correct)
# print(len(correct))
# print(type(correct))

Right now I can only extract the matching entries from the text file and save them as a list. :(

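You can do this in three simple steps: first list all the *.wav files in the directory and strip their extensions, then iterate over the lines of the text file and check whether the string before the first | is present in the file list from the first step; if it is, write that line to the output file. Thus: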

import os

wav_path = "/home/user/wav_files"
txt_path = "/home/user/text_file"
output_path = "/home/user/output_text_file"

# get the *.wav files list, sans their extension; store in a set for fast lookup
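wav_files = {
    f[:-4] for f in os.listdir(wav_path)
    # join with `wav_path`: os.listdir() returns bare names, not full paths
    if f.endswith(".wav") and os.path.isfile(os.path.join(wav_path, f))
}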

# open the `txt_path` for reading and `output_path` for writing
with open(txt_path, "r", encoding="utf-8") as f_in, open(output_path, "w", encoding="utf-8") as f_out:
    for line in f_in:  # iterate the text file line by line
        if line.split("|", 1)[0] in wav_files:  # if present in the file list...
            f_out.write(line)  # write the line to the output file
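
The same three steps can also be wrapped in a reusable function. What follows is only a sketch under a few assumptions: the function name `filter_lines_by_wav` and its `recursive` flag are made up for illustration, the text file is assumed to be UTF-8, and the extension match is assumed to be case-insensitive. The `recursive` option is there because the `os.walk()` loop in the question descends into subdirectories, while `os.listdir()` only looks at the top level.

```python
from pathlib import Path


def filter_lines_by_wav(wav_dir, txt_file, out_file, recursive=False):
    """Copy the lines of txt_file whose ID (the text before the first '|')
    matches the name of a .wav file in wav_dir, minus its extension.

    Returns the number of lines written. `recursive` is a hypothetical
    option that also searches subdirectories, as os.walk() would.
    """
    wav_dir = Path(wav_dir)
    candidates = wav_dir.rglob("*") if recursive else wav_dir.iterdir()
    # a set gives constant-time membership tests for each line of the text file
    stems = {p.stem for p in candidates
             if p.is_file() and p.suffix.lower() == ".wav"}

    kept = 0
    with open(txt_file, "r", encoding="utf-8") as f_in, \
            open(out_file, "w", encoding="utf-8") as f_out:
        for line in f_in:
            # strip() guards against lines that contain no '|' at all
            if line.split("|", 1)[0].strip() in stems:
                f_out.write(line)
                kept += 1
    return kept
```

Calling `filter_lines_by_wav("/home/user/wav_files", "/home/user/text_file", "/home/user/output_text_file")` would then write the matching lines in their original order and return how many were kept.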