Python: ignore lines that match multiple patterns
I have a list like this:
Index1_list=['ATTACTCG','TCCGGAGA','CGCTCATT','GAGATTCC','ATTCAGAA']
I want to keep a sequence line only when it contains exactly one element of the list (not two or three different elements). For example, given:
>seq1
NNNNNNNNNNNNNNNNATTACTCGNNNNNNNNNNNGAGATTCCNNNNN
>seq2
NNNNNNNNNNNNNATTACTCGNNNNNNNNNN
>seq3
NNNNNNNNNNNNNGAGATTCCNNNNNNNNNNN
the output should be:
>seq2
NNNNNNNNNNNNNATTACTCGNNNNNNNNNN
>seq3
NNNNNNNNNNNNNGAGATTCCNNNNNNNNNNN
I used the script below, but it fails to filter out reads that have two different matches.
from Bio import SeqIO
Index1_list=['ATTACTCG','TCCGGAGA','CGCTCATT','GAGATTCC','ATTCAGAA']
with open('All.fastq','r') as R1:
    for record in SeqIO.parse(R1,'fasta'):
        for i in Index1_list:
            if i in record.seq:
                sequences = record.format('fasta')
                print(sequences)
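Thanks.

Your script prints a record once for every index it finds, so a read with two different matches is never rejected. You should be able to do what you want by counting how many elements of the list occur in the sequence and keeping the record only when that count is exactly 1, like this: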
from Bio import SeqIO
Index1_list=['ATTACTCG','TCCGGAGA','CGCTCATT','GAGATTCC','ATTCAGAA']
with open('All.fastq','r') as R1:
    for record in SeqIO.parse(R1,'fasta'):
        # Count how many different indexes occur in this read
        count = 0
        for i in Index1_list:
            if i in record.seq:
                count += 1
        # Keep the read only if exactly one index matched
        if count == 1:
            sequences = record.format('fasta')
            print(sequences)
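
To check the counting logic without Biopython or an input file, here is a minimal sketch that runs the same test on the three example sequences from the question, held as plain strings. The helper names count_distinct_hits and keep_single_hit are made up for this illustration and are not part of any library. It counts distinct indexes, so a read that contains the same index twice still counts as one match.

```python
def count_distinct_hits(sequence, patterns):
    """Return how many different patterns occur at least once in sequence."""
    return sum(1 for pattern in patterns if pattern in sequence)


def keep_single_hit(records, patterns):
    """Yield (header, sequence) pairs whose sequence contains exactly one pattern."""
    for header, sequence in records:
        if count_distinct_hits(sequence, patterns) == 1:
            yield header, sequence


index1_list = ['ATTACTCG', 'TCCGGAGA', 'CGCTCATT', 'GAGATTCC', 'ATTCAGAA']

# The three example reads from the question, as (header, sequence) pairs
records = [
    ('seq1', 'NNNNNNNNNNNNNNNNATTACTCGNNNNNNNNNNNGAGATTCCNNNNN'),
    ('seq2', 'NNNNNNNNNNNNNATTACTCGNNNNNNNNNN'),
    ('seq3', 'NNNNNNNNNNNNNGAGATTCCNNNNNNNNNNN'),
]

kept = list(keep_single_hit(records, index1_list))
for header, sequence in kept:
    print('>' + header)
    print(sequence)
```

Here seq1 is dropped because it contains both ATTACTCG and GAGATTCC, while seq2 and seq3 each contain a single index and are printed.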