Biopython: Can't use .count() for biopython

My goal is to count how many times 'g' occurs in a DNA sequence.

I imported a DNA sequence through Biopython using a list comprehension:
seq = [record for record in SeqIO.parse('sequences/hiv.gbk.rtf', 'fasta')]

Then I tried to use the .count() method on the newly created list variable:

print(seq.count('g'))

I get this error message:

NotImplementedError: SeqRecord comparison is deliberately not implemented. Explicitly compare the attributes of interest.

Does anyone know what the deal is? The Biopython manual says all standard Python methods should work.

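You are calling count on the list of records, not on a sequence. list.count compares each SeqRecord in the list with 'g', and that comparison is what raises the error. You need to apply count to the sequence of each element instead, e.g.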

print(seq[0].seq.count('g'))

or, if you want the total over all sequences,

print(sum([s.seq.count('g') for s in seq]))
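
To make the cause of the error concrete, here is a small sketch that runs without Biopython. FakeRecord is a hypothetical stand-in I made up for SeqRecord, on the assumption that it mirrors SeqRecord's refusal to support == comparisons. The two sequences are invented for illustration.

```python
# FakeRecord is a hypothetical stand-in for Bio.SeqRecord.SeqRecord,
# used only so that this sketch runs without Biopython installed.
class FakeRecord:
    def __init__(self, seq):
        self.seq = seq  # a plain str here; a Bio.Seq.Seq in real code

    def __eq__(self, other):
        # Assumption: mirrors SeqRecord, which refuses == comparisons.
        raise NotImplementedError("record comparison is deliberately not implemented")


# Made-up sequences, not taken from the question's file.
records = [FakeRecord("acgtgg"), FakeRecord("ggca")]

try:
    # list.count tests each element against "g" with ==, which raises here.
    records.count("g")
    list_count_failed = False
except NotImplementedError:
    list_count_failed = True

# Counting on the sequence of each record works as expected.
per_record = [r.seq.count("g") for r in records]
total = sum(per_record)

print(list_count_failed, per_record, total)  # True [3, 2] 5
```

So the error has nothing to do with count being unsupported on sequences. It comes from asking the list to compare whole records with a string.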

Here is a minimal working example:

from Bio import SeqIO

txt = """>gnl|TC-DB|O60669|2.A.1.13.5 Monocarboxylate transporter 2 - Homo sapiens (Human).
MPPMPSAPPVHPPPDGGWGWIVVGAAFISIGFSYAFPKAVTVFFKEIQQIFHTTYSEIAW
>gnl|TC-DB|O60706|3.A.1.208.23 ATP-binding cassette sub-family C member 9 OS=Homo sapiens GN=ABCC9 PE=1 SV=2
MSLSFCGNNISSYNINDGVLQNSCFVDALNLVPHVFLLFITFPILFIGWGSQSSKVQIHH
>gnl|TC-DB|O60721|3.A.1.208.23 Sodium/potassium/calcium exchanger 1 OS=Homo sapiens GN=SLC24A1 PE=1 SV=1
MGKLIRMGPQERWLLRTKRLHWSRLLFLLGMLIIGSTYQHLRRPRGLSSLWAAVSSHQPI
>gnl|TC-DB|O60779|2.A.1.13.5 Thiamine transporter 1 (THTR-1) (ThTr1) (Thiamine carrier 1) (TC1) - Homo sapiens (Human).
MDVPGPVSRRAAAAAATVLLRTARVRRECWFLPTALLCAYGFFASLRPSEPFLTPYLLGP"""

filename = 'sequences.fa'
with open(filename, 'w') as f:
    f.write(txt)

seqs = [record for record in SeqIO.parse(filename, 'fasta')]

print(sum(s.seq.count('P') for s in seqs))
# prints 21

print(seqs[0].seq.count('P'))
# prints 9