Modify python script to run for multiple input files

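I am new to Python. I have a Python script that runs on one specific file (input1.txt) and produces an output (output1.fasta), but I want to run this script on multiple files, for example input2.txt, input3.txt, ..., and produce the corresponding outputs: output2.fasta, output3.fasta.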

from Bio import SeqIO

fasta_file = "sequences.txt" 
wanted_file = "input1.txt" 
result_file = "output1.fasta" 

wanted = set()
with open(wanted_file) as f:
    for line in f:
        line = line.strip()
        if line != "":
            wanted.add(line)
fasta_sequences = SeqIO.parse(open(fasta_file),'fasta')
with open(result_file, "w") as f:
    for seq in fasta_sequences:
        if seq.id in wanted:
            SeqIO.write([seq], f, "fasta")

I tried adding glob, but I don't know how to handle the output file names.

from Bio import SeqIO
import glob

fasta_file = "sequences.txt"

for filename in glob.glob('*.txt'):

    wanted = set()
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    fasta_sequences = SeqIO.parse(open(fasta_file),'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")

The error message is: NameError: name 'result_file' is not defined

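Your glob currently picks up your "sequences" file as well as the inputs, because *.txt also matches sequences.txt. If the FASTA file is always the same and you only want to iterate over the input files, then you need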

for filename in glob.glob('input*.txt'):
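
To see the difference between the two patterns without touching the disk, here is a small sketch using fnmatch, the module that glob relies on for pattern matching. The list of file names is made up for illustration.

```python
import fnmatch

# Hypothetical directory listing, for illustration only.
names = ["sequences.txt", "input1.txt", "input2.txt", "notes.md"]

# '*.txt' matches every .txt file, including the FASTA source file.
all_txt = fnmatch.filter(names, "*.txt")

# 'input*.txt' matches only the files that hold the wanted IDs.
inputs_only = fnmatch.filter(names, "input*.txt")

print(all_txt)
print(inputs_only)
```

With the narrower pattern, sequences.txt is no longer treated as a list of wanted IDs.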

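Also, to run the whole process for each file, you may want to put it in a function. If the output file name is always derived from the input file name, you can build it dynamically.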

from Bio import SeqIO
import glob

def create_fasta_outputs(fasta_file, wanted_file):
    # Derive the output name from the input name: input1.txt -> output1.fasta
    result_file = wanted_file.replace("input","output").replace(".txt",".fasta")

    # Collect the non-empty IDs listed in the input file
    wanted = set()
    with open(wanted_file) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    # Passing the file name lets SeqIO open and close the file itself
    fasta_sequences = SeqIO.parse(fasta_file, 'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            # Write only the records whose ID was requested
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")

fasta_file = "sequences.txt"
for wanted_file in glob.glob('input*.txt'):
    create_fasta_outputs(fasta_file, wanted_file)
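
One caveat with chaining str.replace is that it rewrites every occurrence of "input" and ".txt", including any in a directory name. If that could matter, a possible alternative is to change only the file name part with pathlib. This is a sketch, not part of the original answer, and output_name_for is a hypothetical helper name.

```python
from pathlib import Path

def output_name_for(wanted_file):
    """Map e.g. 'input2.txt' to 'output2.fasta', leaving any directory part untouched."""
    path = Path(wanted_file)
    stem = path.stem  # file name without the extension, e.g. "input2"
    if stem.startswith("input"):
        # Swap only the leading "input" for "output"
        stem = "output" + stem[len("input"):]
    # Replace the file name but keep the parent directory as it was
    return str(path.with_name(stem + ".fasta"))

print(output_name_for("input2.txt"))  # output2.fasta
```

Inside create_fasta_outputs, the result_file line could then call this helper instead of the two replace calls.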