Python - how to reduce a list and keep values?

I have a list that looks like this:

[[12, 0.029], [12, 0.039], [12, 0.012], ...some hundreds more... [13, 0.04], [13, 0.01], ...]

The first value ranges from 3 to 15, and there are about 3000 values in total.

For a box plot, I need a solution that creates one box per first value, using all of the second values that belong to it. Something like:

data_to_plot = [ all second values of list with value 12], [all second values of list with value 13],... 

Which would look like:

data_to_plot = [0.029, 0.039], [0.04, 0.01],...

Thanks!

It looks like you want a dictionary with the first value as the key and the second values collected in a list. You can do it like this:

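# Your list of [first, second] pairs (renamed from `list`, which shadows the built-in)
data = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]

data_dict = {}
for key, value in data:
    if key not in data_dict:
        # First time this key is seen: start a new list
        data_dict[key] = [value]
    else:
        data_dict[key].append(value)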

With your sample data, this yields {12: [0.029, 0.039], 13: [0.04, 0.01]}
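
The same grouping can be written a little more compactly with `dict.setdefault`. This is only a minimal sketch: the variable name `data` is chosen here for illustration, and the final list relies on dictionaries keeping insertion order, which is guaranteed from Python 3.7 on.

```python
# Sample data in the shape given in the question (variable name is an assumption)
data = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]

data_dict = {}
for key, value in data:
    # setdefault returns the list already stored under key,
    # or stores a new empty list and returns that
    data_dict.setdefault(key, []).append(value)

# One list of second values per first value, in order of first appearance
data_to_plot = list(data_dict.values())

print(data_dict)     # {12: [0.029, 0.039], 13: [0.04, 0.01]}
print(data_to_plot)  # [[0.029, 0.039], [0.04, 0.01]]
```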

Use itertools.groupby. This assumes your data (a list of lists) is sorted by the first value.

import itertools
import operator

lists = [[12, 0.029], [12, 0.039], [12, 0.052], [13, 0.04], [13, 0.01], [13, 0.066]]

data_to_plot = []
for name, group in itertools.groupby(lists, key=operator.itemgetter(0)):
    # list(...) is needed on Python 3, where map returns a lazy iterator
    data_to_plot.append(list(map(operator.itemgetter(1), group)))

print(data_to_plot)
# Output:
# [[0.029, 0.039, 0.052], [0.04, 0.01, 0.066]]

If the sublists are not pre-sorted, you should sort them before calling groupby:

data_to_plot = []
for name, group in itertools.groupby(sorted(lists), key=operator.itemgetter(0)):
    # list(...) is needed on Python 3, where map returns a lazy iterator
    data_to_plot.append(list(map(operator.itemgetter(1), group)))

print(data_to_plot)
# Output:
# [[0.029, 0.039, 0.052], [0.01, 0.04, 0.066]]
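
Note that `sorted(lists)` compares whole sublists, so the second values inside each group come out sorted too. If the original order within a group matters, sorting on the first item only should preserve it, because Python's sort is stable. A small sketch, with sample data shuffled on purpose for illustration:

```python
import itertools
import operator

# Unsorted sample data (made up for illustration)
lists = [[13, 0.04], [12, 0.029], [13, 0.01], [12, 0.039]]

first = operator.itemgetter(0)
second = operator.itemgetter(1)

# Sort by the first item only; the stable sort keeps the original
# relative order of sublists that share the same first item
ordered = sorted(lists, key=first)

data_to_plot = [
    [second(item) for item in group]
    for _, group in itertools.groupby(ordered, key=first)
]

print(data_to_plot)  # [[0.029, 0.039], [0.04, 0.01]]
```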

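Use a defaultdict whose default is list, so you don't need to check whether the key already exists. Then collect the values, using the first item as the key: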

from collections import defaultdict

result = defaultdict(list)

lst = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]
for l in lst:
    result[l[0]].append(l[1])

print(list(result.values()))
# [[0.029, 0.039], [0.04, 0.01]]

This way, you still know which values belong to which keys:

data_to_plot = result.values()
keys_for_data = result.keys()
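
Because `keys()` and `values()` iterate in the same order, the two can be paired up to label each box. A sketch, assuming Python 3 and assuming the boxes should be ordered by key; the name `labels` is illustrative and the sample data is shuffled on purpose:

```python
from collections import defaultdict

lst = [[13, 0.04], [12, 0.029], [13, 0.01], [12, 0.039]]

result = defaultdict(list)
for key, value in lst:
    result[key].append(value)

# Sort the (key, values) pairs by key, then split them into two aligned lists.
# This assumes lst is not empty; zip(*[]) would yield nothing to unpack.
labels, data_to_plot = map(list, zip(*sorted(result.items())))

print(labels)        # [12, 13]
print(data_to_plot)  # [[0.029, 0.039], [0.04, 0.01]]
```

A list of lists shaped like `data_to_plot` is the form matplotlib's `boxplot` accepts for several datasets, assuming matplotlib is the plotting library in use (the question does not say).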

You can use an iterator (but the list of pairs must have an even length):

data = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]
iter_second = (x[1] for x in data)
# Python 2 (zip returns a list):
# data_to_plot = zip(*([iter_second]*2))
# Python 3 (zip is lazy, so materialize it):
data_to_plot = tuple(zip(*([iter_second]*2)))
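
The `zip(*[iterator]*2)` idiom works because both arguments to `zip` are the same iterator, so each output tuple consumes two consecutive items. The sketch below illustrates the even-length caveat mentioned above and, as an alternative that is not part of the original answer, pads the leftover item with `itertools.zip_longest`. The three-row sample data is made up for illustration.

```python
from itertools import zip_longest

data = [[12, 0.029], [12, 0.039], [12, 0.012]]  # odd number of rows

# zip stops at the shortest input, so the unpaired last value is dropped
it = (x[1] for x in data)
pairs = list(zip(*[it] * 2))

# zip_longest pads the final, incomplete tuple with fillvalue instead
it = (x[1] for x in data)
padded = list(zip_longest(*[it] * 2, fillvalue=None))

print(pairs)   # [(0.029, 0.039)]
print(padded)  # [(0.029, 0.039), (0.012, None)]
```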

But there can be n values starting with 12. In that case:

import collections

data = [[12, 0.029], [12, 0.039], [12, 0.012], [13, 0.04], [13, 0.01]]
d = collections.defaultdict(list)
for key, val in data:
    d[key].append(val)
# Before Python 3.7, use OrderedDict with the setdefault method
# if you need the same key order as in `data`
data_to_plot = list(d.values())  # list(...) because values() is a view on Python 3
# Output: [[0.029, 0.039, 0.012], [0.04, 0.01]]

Something like this (the other answers missed that you only want the data points whose first value is 12):

data = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]

items = []
points = [point[1] for point in data if point[0] == 12]
for i in range(0, len(points), 2):
    try:
        items.append([points[i], points[i+1]])
    except IndexError:
        pass

print(items)
# [[0.029, 0.039]]

Also, what should happen with lists that have an odd number of data points?