Extract data using a loop over a tuple in Python

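I have the following list of tuples, each containing a molecule number (MolNum) and the corresponding distance from some reference point. The molecules are sorted in ascending order of distance. I can already extract MolNum and the distances as two separate lists. However, I want to get the elements of g that satisfy the condition 10 < distance < 100, so that I end up with gg. How can I do this?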

g = [(MolNum(378), 2.4613922385709617e-14),
 (MolNum(373), 40.6680008439399),
 (MolNum(353), 72.49296570091882),
 (MolNum(354), 83.18203548933252),
 (MolNum(359), 88.23588863972836),
 (MolNum(372), 97.47433492265824),
 (MolNum(369), 104.59206739018573),
 (MolNum(370), 114.66573137439451),
 (MolNum(361), 122.33788252133775),
 (MolNum(376), 137.2686523522959),
 (MolNum(360), 141.72521396936926),
 (MolNum(371), 145.96842598002533),
 (MolNum(352), 149.8990795114449),
 (MolNum(366), 164.55606071030496),
 (MolNum(358), 180.72531479536423),
 (MolNum(375), 182.21612213617874),
 (MolNum(364), 185.78028496680486),
 (MolNum(363), 192.02220222384793),
 (MolNum(368), 194.0298647708072),
 (MolNum(365), 194.57037736733918),
 (MolNum(356), 201.91526815811372),
 (MolNum(362), 217.8580017023349),
 (MolNum(357), 234.3818585062885),
 (MolNum(374), 241.33751568809993),
 (MolNum(367), 249.36129229747306),
 (MolNum(355), 253.59625354913504)]

After applying the condition:

gg = [(MolNum(373), 40.6680008439399),
 (MolNum(353), 72.49296570091882),
 (MolNum(354), 83.18203548933252),
 (MolNum(359), 88.23588863972836),
 (MolNum(372), 97.47433492265824)]

gg = [(mol_num, distance) for mol_num, distance in g if 10 < distance < 100]
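
Since the question states that g is already sorted by ascending distance, a binary search is another option. The following is only a sketch under that assumption: plain integers stand in for the MolNum objects, the distances are rounded for brevity, and the names distances, lo and hi are illustrative.

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in data: plain ints replace the MolNum objects.
g = [(378, 2.46e-14), (373, 40.67), (353, 72.49), (354, 83.18),
     (359, 88.24), (372, 97.47), (369, 104.59), (370, 114.67)]

# Distances alone, in the same (ascending) order as g.
distances = [distance for _, distance in g]

lo = bisect_right(distances, 10)   # index of the first distance > 10
hi = bisect_left(distances, 100)   # index of the first distance >= 100

# Everything between the two indices satisfies 10 < distance < 100.
gg = g[lo:hi]
print(gg)
```

Building the distances list is still linear, so this mainly pays off when that list is kept around for repeated range queries. For a single query the list comprehension is simpler.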

You can try it like this:

gg = [item for item in g if 10 < item[1] < 100]
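
As a quick check, the index-based form and the tuple-unpacking form of the comprehension select the same elements. This is a minimal sketch: the MolNum class here is a hypothetical stand-in for the real one, and only four of the original entries are used.

```python
class MolNum:
    """Hypothetical stand-in for the real MolNum class."""

    def __init__(self, n):
        self.n = n

    def __repr__(self):
        return "MolNum(%d)" % self.n


g = [(MolNum(378), 2.4613922385709617e-14),
     (MolNum(373), 40.6680008439399),
     (MolNum(372), 97.47433492265824),
     (MolNum(369), 104.59206739018573)]

# Index-based access: item[1] is the distance.
by_index = [item for item in g if 10 < item[1] < 100]

# Tuple unpacking: the same filter with named parts.
by_name = [(mol_num, distance) for mol_num, distance in g
           if 10 < distance < 100]

print(by_index)
```

The unpacking form is a little longer, but it names the two fields, which can make the condition easier to read.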

Or you may want to look at @Anand S Kumar's answer using filter(), which is a more Pythonic way.

Hope this helps.

You can use the builtin filter function for this, passing the condition as a lambda expression in the first argument and the list to filter in the second argument -

gg = list(filter(lambda x: 10 < x[1] < 100,g))

For Python 2.7, you do not need list(...), because filter returns a list.


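In Python 3.x, the filter() function returns an iterator that yields the elements satisfying the condition (i.e. those for which the condition returns True).

In Python 2.7, the filter() function returns a list of the elements satisfying the condition (i.e. those for which the condition returns True).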
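
One practical consequence of the Python 3 behaviour is that the filter object can be consumed only once. A small sketch, assuming Python 3 and using plain integers with rounded distances as hypothetical stand-ins for the MolNum entries:

```python
# Hypothetical stand-in data: plain ints replace the MolNum objects.
g = [(378, 2.46e-14), (373, 40.67), (372, 97.47), (369, 104.59)]

# In Python 3 this is a lazy iterator; nothing is filtered yet.
matches = filter(lambda x: 10 < x[1] < 100, g)

first_pass = list(matches)   # consumes the iterator
second_pass = list(matches)  # already exhausted, so this is empty

print(first_pass)
print(second_pass)
```

This is why wrapping the call in list(...) is the usual choice when the result is needed more than once or its length is required.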


Example/Demo -

>>> class MolNum:
...     def __init__(self, n):
...         self.n = n
...
>>> g = [(MolNum(378), 2.4613922385709617e-14),
...  (MolNum(373), 40.6680008439399),
...  (MolNum(353), 72.49296570091882),
...  (MolNum(354), 83.18203548933252),
...  (MolNum(359), 88.23588863972836),
...  (MolNum(372), 97.47433492265824),
...  (MolNum(369), 104.59206739018573),
...  (MolNum(370), 114.66573137439451),
...  (MolNum(361), 122.33788252133775),
...  (MolNum(376), 137.2686523522959),
...  (MolNum(360), 141.72521396936926),
...  (MolNum(371), 145.96842598002533),
...  (MolNum(352), 149.8990795114449),
...  (MolNum(366), 164.55606071030496),
...  (MolNum(358), 180.72531479536423),
...  (MolNum(375), 182.21612213617874),
...  (MolNum(364), 185.78028496680486),
...  (MolNum(363), 192.02220222384793),
...  (MolNum(368), 194.0298647708072),
...  (MolNum(365), 194.57037736733918),
...  (MolNum(356), 201.91526815811372),
...  (MolNum(362), 217.8580017023349),
...  (MolNum(357), 234.3818585062885),
...  (MolNum(374), 241.33751568809993),
...  (MolNum(367), 249.36129229747306),
...  (MolNum(355), 253.59625354913504)]
>>> filter(lambda x: 10 < x[1] < 100, g)
<filter object at 0x02302E70>
>>> gg = list(filter(lambda x: 10 < x[1] < 100,g))
>>> len(gg)
5

You might want to take a look at Pandas, a very commonly used package for tabular data analysis:

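import pandas as pd
df = pd.DataFrame(g)  # column 0: MolNum, column 1: distance
# between() includes both endpoints by default; inclusive="neither" (pandas 1.3+) matches 10 < distance < 100
gg = df[df[1].between(10, 100, inclusive="neither")]
gg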

Out[239]: 
     0          1
1  373  40.668001
2  353  72.492966
3  354  83.182035
4  359  88.235889
5  372  97.474335