Python - how to reduce a list and keep values?
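I have a list that looks like this:
[[12, 0.029], [12, 0.039], [12, 0.012], ...some hundreds more... [13, 0.04], [13, 0.01], ...]
The first value ranges from 3 to 15, and there are about 3000 values in total.
For a box plot, I need a solution that creates one box per first value, built from all of the second values that share it. Like:
data_to_plot = [all second values of list with value 12], [all second values of list with value 13], ...
which would look like:
data_to_plot = [0.029, 0.039], [0.04, 0.01], ...
Thanks!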
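It looks like you want a dictionary with the first values as keys and the second values collected into a list under each key. You could do it like this:
data_dict = {}
# `data` is your list of pairs; avoid calling it `list`, which shadows the built-in
for key, value in data:
    if key not in data_dict:
        data_dict[key] = [value]
    else:
        data_dict[key].append(value)
With your example data this produces {12: [0.029, 0.039], 13: [0.04, 0.01]}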
Use itertools.groupby. This assumes your data (the list of lists) is already sorted by the first value.
import itertools
import operator
lists = [[12, 0.029], [12, 0.039], [12, 0.052], [13, 0.04], [13, 0.01], [13, 0.066]]
data_to_plot = []
for name, group in itertools.groupby(lists, key=operator.itemgetter(0)):
    # list(...) is needed on Python 3, where map returns a lazy iterator
    data_to_plot.append(list(map(operator.itemgetter(1), group)))
print(data_to_plot)
# Output
# [[0.029, 0.039, 0.052], [0.04, 0.01, 0.066]]
If the sublists are not pre-sorted, sort them before calling groupby:
data_to_plot = []
# sorted(lists) compares whole pairs, so the values inside each group come out sorted too
for name, group in itertools.groupby(sorted(lists), key=operator.itemgetter(0)):
    data_to_plot.append(list(map(operator.itemgetter(1), group)))
print(data_to_plot)
# Output
# [[0.029, 0.039, 0.052], [0.01, 0.04, 0.066]]
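If the original order of the values inside each group matters, one option is to sort on the first item only. Python's sort is stable, so pairs with equal keys keep their relative order. A minimal sketch, using deliberately unsorted sample data made up for illustration:

```python
import itertools
import operator

# Unsorted sample data (hypothetical, not from the question)
lists = [[13, 0.04], [12, 0.029], [13, 0.01], [12, 0.039]]

first = operator.itemgetter(0)
second = operator.itemgetter(1)

# Sort by the key only; the stable sort keeps each group's values in input order
data_to_plot = [
    [second(pair) for pair in group]
    for _, group in itertools.groupby(sorted(lists, key=first), key=first)
]
print(data_to_plot)  # [[0.029, 0.039], [0.04, 0.01]]
```

With a plain sorted(lists) the second group would come out as [0.01, 0.04] instead.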
Use a defaultdict whose default factory is list, so you don't need to check whether the key already exists. Then collect the values using the first item as the key:
from collections import defaultdict
result = defaultdict(list)
lst = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]
for pair in lst:
    result[pair[0]].append(pair[1])
print(list(result.values()))
# [[0.029, 0.039], [0.04, 0.01]]
This way you still know which values belong to which keys.
data_to_plot = result.values()
keys_for_data = result.keys()
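To keep the labels and the grouped values aligned in a predictable order, you could read both out in sorted key order. This is only a sketch: the names `labels` and the sample data are assumptions for illustration, and the two lists can then be handed to whatever box-plot call you use.

```python
from collections import defaultdict

# Sample data in arbitrary order (hypothetical)
lst = [[13, 0.04], [12, 0.029], [12, 0.039], [13, 0.01]]

result = defaultdict(list)
for key, value in lst:
    result[key].append(value)

# Sorted keys become the labels; each label lines up with its list of values
labels = sorted(result)
data_to_plot = [result[key] for key in labels]

print(labels)        # [12, 13]
print(data_to_plot)  # [[0.029, 0.039], [0.04, 0.01]]
```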
You can use an iterator (but this only works if every key has exactly two values, so the list must have an even length):
data = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]
iter_second = (x[1] for x in data)
#py2
data_to_plot = zip(*([iter_second]*2))
#py3
data_to_plot = tuple(zip(*([iter_second]*2)))
But if there can be any number of values starting with 12, group them by key instead:
import collections
data = [[12, 0.029], [12, 0.039], [12, 0.012], [13, 0.04], [13, 0.01]]
d = collections.defaultdict(list)
for key, val in data:
    d[key].append(val)
# if you need the same order as in `data`, use OrderedDict with the setdefault method
data_to_plot = list(d.values())
# Output: [[0.029, 0.039, 0.012], [0.04, 0.01]]
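The OrderedDict variant mentioned in the comment might look like the following sketch. The sample data is made up so that the keys first appear out of numeric order. Note that since Python 3.7 a plain dict preserves insertion order as well, so OrderedDict mainly matters on older versions.

```python
from collections import OrderedDict

# Keys first appear in the order 13, 12 (hypothetical sample data)
data = [[13, 0.04], [12, 0.029], [13, 0.01], [12, 0.039], [12, 0.012]]

d = OrderedDict()
for key, val in data:
    # setdefault inserts an empty list the first time a key is seen
    d.setdefault(key, []).append(val)

# Groups come out in the order their key first appeared in `data`
data_to_plot = list(d.values())
print(data_to_plot)  # [[0.04, 0.01], [0.029, 0.039, 0.012]]
```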
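Something like this (the other answers missed that you only want the data points whose first value is 12):
data = [[12, 0.029], [12, 0.039], [13, 0.04], [13, 0.01]]
items = []
points = [point[1] for point in data if point[0] == 12]
for i in range(0, len(points), 2):
    try:
        items.append([points[i], points[i+1]])
    except IndexError:
        # an unpaired last point is silently skipped
        pass
print(items)
# [[0.029, 0.039]]
Also, what about lists with an odd number of data points?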
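On the odd-length case: pairing with zip stops at the shortest input, so an unpaired last value is dropped, much like the IndexError branch above. If that value should be kept, itertools.zip_longest pads the final pair instead. A small sketch with made-up values:

```python
from itertools import zip_longest

points = [0.029, 0.039, 0.012]  # odd number of values (hypothetical)

# Passing the same iterator twice yields consecutive pairs;
# zip drops the leftover value when the second slot cannot be filled
it = iter(points)
pairs_dropping = list(zip(it, it))
print(pairs_dropping)  # [(0.029, 0.039)]

# zip_longest keeps the leftover value and pads the missing slot
it = iter(points)
pairs_padded = list(zip_longest(it, it, fillvalue=None))
print(pairs_padded)  # [(0.029, 0.039), (0.012, None)]
```

Whether to drop or pad depends on what the pairs are used for; for a box plot, grouping all values per key avoids the question entirely.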