Editing a text file based on the content of another file via Python
I have some wav files in a directory, like this:
- BAC009S0177W0368.wav
- BAC009S0231W0262.wav
- BAC009S0517W0431.wav
- BAC009S0002W0131.wav
- ...
I also have a text file:
- BAC009S0002W0122|testing1
- BAC009S0002W0123|testing2
- BAC009S0517W0431|testing3
- BAC009S0002W0131|testing4
- ...
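Using only Python, how can I create another text file that contains just the lines whose key matches the file name of one of those wav files? For example:
- BAC009S0517W0431|testing3
- BAC009S0002W0131|testing4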
- ...
Thanks, @zwer.
Sorry, I forgot to post the code I tried. :(
Here it is:
import os

wav_path = "/home/user/wav_files"  # wav files directory
txt_path = "/home/user/text_file"  # BAC009S0002W0122|testing1
output_path = "/home/user/output_text_file"

standard = []
for root, dirs, files in os.walk(wav_path):
    for index, filename in enumerate(files):
        standard.append(filename[:-4])
# print(len(standard))
# print(standard)

test = []
with open(txt_path, 'r', encoding='utf-8') as infile:
    for line in infile:
        parts = line.strip().split('|')
        test.append(parts[0])
# print(test)

correct = set(standard) & set(test)
correct = list(correct)
# print(correct)
# print(len(correct))
# print(type(correct))
So far, I can only extract the matching entries from the text file and store them as a list. :(
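You can do this in three simple steps: first, list all the *.wav files in the directory and strip their extensions; then iterate over the lines of the text file and check whether the string before the first "|" is in the file list from the first step; if it is, write that line to the output file. So: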
import os

wav_path = "/home/user/wav_files"
txt_path = "/home/user/text_file"
output_path = "/home/user/output_text_file"

# get the *.wav files list, sans their extension; store in a set for fast lookup
# (join each name with `wav_path` so isfile() does not check the current directory)
wav_files = {f[:-4] for f in os.listdir(wav_path)
             if f.endswith(".wav") and os.path.isfile(os.path.join(wav_path, f))}

# open the `txt_path` for reading and `output_path` for writing
with open(txt_path, "r", encoding="utf-8") as f_in, \
        open(output_path, "w", encoding="utf-8") as f_out:
    for line in f_in:  # iterate the text file line by line
        if line.split("|", 1)[0] in wav_files:  # if present in the file list...
            f_out.write(line)  # write the line to the output file
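The same steps can also be wrapped in a reusable function. The following is only a sketch under a few assumptions: the name filter_by_wav_names and its sep parameter are hypothetical, the text file is assumed to be UTF-8, and the extension check is made case-insensitive, which the question does not ask for. Unlike a set intersection of the two name lists, it keeps the full lines and preserves their order from the text file.

```python
from pathlib import Path


def filter_by_wav_names(wav_dir, txt_file, out_file, sep="|"):
    """Copy lines of txt_file whose key (text before sep) names a .wav file in wav_dir.

    Returns the number of lines written. Hypothetical helper, not part of any library.
    """
    # Path.stem drops the final extension: "BAC009S0517W0431.wav" -> "BAC009S0517W0431"
    stems = {
        p.stem
        for p in Path(wav_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    }
    kept = 0
    with open(txt_file, encoding="utf-8") as f_in, \
            open(out_file, "w", encoding="utf-8") as f_out:
        for line in f_in:
            # split once, so any further separators in the text are left untouched
            key = line.split(sep, 1)[0].strip()
            if key in stems:
                f_out.write(line)
                kept += 1
    return kept
```

Calling filter_by_wav_names("/home/user/wav_files", "/home/user/text_file", "/home/user/output_text_file") would then write the matching lines and return how many were kept.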