How to group list by values?
Suppose I have a list like this:
[['John', 1], ['Fred', 1], ['Carolyn', 1], ['Kenneth', 3], ['Ronald', 3], ['Dorothy', 2], ['Joyce', 2], ['Julia', 3], ['Deborah', 1], ['Jonathan', 2], ['Aaron', 2], ['Marie', 1], ['Adam', 2], ['Kevin', 2], ['Alice', 2], ['Jerry', 1], ['Kimberly', 1], ['Lawrence', 1], ['Louis', 2], ['Anthony', 1], ['Carolyn', 3], ['Edward', 2], ['Samuel', 3], ['Rachel', 1], ['Kathleen', 1], ['Fred', 3], ['Fred', 3], ['Gerald', 2], ['Donna', 2], ['Keith', 3], ['Matthew', 3], ['Stephanie', 1]]
How can I get this output:
John Dorothy Kenneth
Fred Joyce Ronald
Carolyn Jonathan Julia
Deborah Aaron Carolyn
Marie Adam Samuel
Jerry Kevin Fred
Kimberly Alice Fred
Lawrence Louis Keith
Anthony Edward Matthew
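What I want to do is build groups of 3 elements, taking one name for each number. If a group does not consist of 3 names, it is not displayed.
Another example:
Input: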
[['Heather', 2], ['Evelyn', 1], ['Norma', 1], ['Evelyn', 3], ['Harry', 3], ['Sean', 1], ['Anna', 1], ['Jerry', 3], ['Anna', 3], ['Julia', 1], ['Dorothy', 2]]
Expected output:
Evelyn Heather Evelyn
Norma Dorothy Harry
So far, I have only managed to group the names into sublists by number (1, 2, 3).
r = [[] for i in range(3)]
for i in l:
    if i[1] == 1:
        r[0].append(i[0])
    elif i[1] == 2:
        r[1].append(i[0])
    elif i[1] == 3:
        r[2].append(i[0])
print(r)
r = [['John', 'Fred', 'Carolyn', 'Deborah', 'Marie', 'Jerry', 'Kimberly', 'Lawrence', 'Anthony', 'Rachel', 'Kathleen', 'Stephanie'], ['Dorothy', 'Joyce', 'Jonathan', 'Aaron', 'Adam', 'Kevin', 'Alice', 'Louis', 'Edward', 'Gerald', 'Donna'], ['Kenneth', 'Ronald', 'Julia', 'Carolyn', 'Samuel', 'Fred', 'Fred', 'Keith', 'Matthew']]
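I think the simplest answer is to use a dictionary to collect the results by the number you have (which becomes the dictionary key). You can then filter the results stored in the dictionary by length: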
In [7]: from collections import defaultdict
In [8]: results = defaultdict(list)
In [9]: name_list = [['bob', 1], ['cindy', 1], ['ted', 2]]
In [10]: for (value, key) in name_list:
    ...:     results[key].append(value)
    ...:
In [11]: results
Out[11]: defaultdict(list, {1: ['bob', 'cindy'], 2: ['ted']})
In [13]: for key in results:
    ...:     if len(results.get(key)) == 2:
    ...:         print( 'found a result of length 2: ', results.get(key))
    ...:
found a result of length 2: ['bob', 'cindy']
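Applied to the original question, the same dictionary idea could look like the sketch below. The function name `group_rows` and the variable `sample` are illustrative and not part of the answer above; the sketch assumes the keys are sortable and that each row should hold one name per key.

```python
from collections import defaultdict


def group_rows(pairs):
    """Return rows holding one name per key, dropping incomplete rows."""
    buckets = defaultdict(list)
    for name, key in pairs:
        buckets[key].append(name)
    # Order the columns by key, then let zip() stop at the shortest bucket.
    columns = [buckets[key] for key in sorted(buckets)]
    return [" ".join(row) for row in zip(*columns)]


sample = [['Heather', 2], ['Evelyn', 1], ['Norma', 1], ['Evelyn', 3],
          ['Harry', 3], ['Sean', 1], ['Anna', 1], ['Jerry', 3],
          ['Anna', 3], ['Julia', 1], ['Dorothy', 2]]
print("\n".join(group_rows(sample)))
```

Sorting the keys keeps the column order independent of the order in which the numbers first appear in the input.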
Here is an approach similar to this question:
l1 = [['John', 1], ['Fred', 1], ['Carolyn', 1], ['Kenneth', 3], ['Ronald', 3], ['Dorothy', 2], ['Joyce', 2], ['Julia', 3], ['Deborah', 1], ['Jonathan', 2], ['Aaron', 2], ['Marie', 1], ['Adam', 2], ['Kevin', 2], ['Alice', 2], ['Jerry', 1], ['Kimberly', 1], ['Lawrence', 1], ['Louis', 2], ['Anthony', 1], ['Carolyn', 3], ['Edward', 2], ['Samuel', 3], ['Rachel', 1], ['Kathleen', 1], ['Fred', 3], ['Fred', 3], ['Gerald', 2], ['Donna', 2], ['Keith', 3], ['Matthew', 3], ['Stephanie', 1]]
l2 = [['Heather', 2], ['Evelyn', 1], ['Norma', 1], ['Evelyn', 3], ['Harry', 3], ['Sean', 1], ['Anna', 1], ['Jerry', 3], ['Anna', 3], ['Julia', 1], ['Dorothy', 2]]
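def get_exp(l):
    # Sort the distinct numbers so the columns come out in a fixed order
    v = sorted(set(map(lambda x: x[1], l)))
    nl = [[y[0] for y in l if y[1] == x] for x in v]
    # zip stops at the shortest sublist, which drops incomplete rows
    return '\n'.join(map(' '.join, zip(*nl)))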
output_l1 = get_exp(l1)
output_l2 = get_exp(l2)
output_l1 :
John Dorothy Kenneth
Fred Joyce Ronald
Carolyn Jonathan Julia
Deborah Aaron Carolyn
Marie Adam Samuel
Jerry Kevin Fred
Kimberly Alice Fred
Lawrence Louis Keith
Anthony Edward Matthew
output_l2 :
Evelyn Heather Evelyn
Norma Dorothy Harry
lst1 = [['John', 1], ['Fred', 1], ['Carolyn', 1], ['Kenneth', 3], ['Ronald', 3], ['Dorothy', 2], ['Joyce', 2], ['Julia', 3], ['Deborah', 1], ['Jonathan', 2], ['Aaron', 2], ['Marie', 1], ['Adam', 2], ['Kevin', 2], ['Alice', 2], ['Jerry', 1], ['Kimberly', 1], ['Lawrence', 1], ['Louis', 2], ['Anthony', 1], ['Carolyn', 3], ['Edward', 2], ['Samuel', 3], ['Rachel', 1], ['Kathleen', 1], ['Fred', 3], ['Fred', 3], ['Gerald', 2], ['Donna', 2], ['Keith', 3], ['Matthew', 3], ['Stephanie', 1]]
lst2 = [['Heather', 2], ['Evelyn', 1], ['Norma', 1], ['Evelyn', 3], ['Harry', 3], ['Sean', 1], ['Anna', 1], ['Jerry', 3], ['Anna', 3], ['Julia', 1], ['Dorothy', 2]]
from itertools import groupby
def my_print(lst):
    d = {v: list(g) for v, g in groupby(sorted(lst, key=lambda k: k[-1]), lambda v: v[-1])}
    while True:
        try:
            i1 = d[1].pop(0)
            i2 = d[2].pop(0)
            i3 = d[3].pop(0)
            print('{} {} {}'.format(i1[0], i2[0], i3[0]))
        except IndexError:
            break
my_print(lst1)
print('*' * 80)
my_print(lst2)
Prints:
John Dorothy Kenneth
Fred Joyce Ronald
Carolyn Jonathan Julia
Deborah Aaron Carolyn
Marie Adam Samuel
Jerry Kevin Fred
Kimberly Alice Fred
Lawrence Louis Keith
Anthony Edward Matthew
********************************************************************************
Evelyn Heather Evelyn
Norma Dorothy Harry
How about:
l = [['Heather', 2], ['Evelyn', 1], ['Norma', 1], ['Evelyn', 3], ['Harry', 3], ['Sean', 1], ['Anna', 1], ['Jerry', 3], ['Anna', 3], ['Julia', 1], ['Dorothy', 2]]
from itertools import groupby
def keyfunc(arr):
    return arr[1]
l = sorted(l, key=keyfunc)
s = [[*x] for i, x in groupby(l, keyfunc)]
combinations = [*zip(*s)]
You can then print the elements like this:
for combo in combinations:
    print(' '.join([x[0] for x in combo]))
For the list above, this prints:
Evelyn Heather Evelyn
Norma Dorothy Harry
You can combine zip and itertools.groupby to do this very concisely. First sort the list by number, then group and zip. If you want strings, you can join them:
from operator import itemgetter
from itertools import groupby
l = [['John', 1], ['Fred', 1], ['Carolyn', 1], ['Kenneth', 3], ['Ronald', 3], ['Dorothy', 2], ['Joyce', 2], ['Julia', 3], ['Deborah', 1], ['Jonathan', 2], ['Aaron', 2], ['Marie', 1], ['Adam', 2], ['Kevin', 2], ['Alice', 2], ['Jerry', 1], ['Kimberly', 1], ['Lawrence', 1], ['Louis', 2], ['Anthony', 1], ['Carolyn', 3], ['Edward', 2], ['Samuel', 3], ['Rachel', 1], ['Kathleen', 1], ['Fred', 3], ['Fred', 3], ['Gerald', 2], ['Donna', 2], ['Keith', 3], ['Matthew', 3], ['Stephanie', 1]]
l.sort(key = itemgetter(1))
groups = zip(*([name for name, g in n] for k, n in groupby(l, itemgetter(1))))
[" ".join(names) for names in groups]
Output:
['John Dorothy Kenneth',
'Fred Joyce Ronald',
'Carolyn Jonathan Julia',
'Deborah Aaron Carolyn',
'Marie Adam Samuel',
'Jerry Kevin Fred',
'Kimberly Alice Fred',
'Lawrence Louis Keith',
'Anthony Edward Matthew']
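Incomplete rows disappear here because zip stops as soon as its shortest input runs out. As a small sketch with made-up, already-grouped lists (the names `ones`, `twos`, `threes` are only for illustration), compare it with itertools.zip_longest, which pads instead of truncating:

```python
from itertools import zip_longest

# Hypothetical groups of unequal length
ones = ['John', 'Fred', 'Carolyn']
twos = ['Dorothy', 'Joyce']
threes = ['Kenneth', 'Ronald', 'Julia']

# zip truncates to the shortest group, so only full rows remain
complete = list(zip(ones, twos, threes))
# zip_longest keeps every row and fills the gaps
padded = list(zip_longest(ones, twos, threes, fillvalue='-'))

print(complete)
print(padded)
```

If partial rows should be shown after all, swapping zip for zip_longest is one possible way to do it.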
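Group the names into a dictionary by their second element, then zip them together.
def gum(l):
    g = {}
    for n, k in l:
        g.setdefault(k, []).append(n)
    # Take the groups in key order; plain dict order follows first appearance
    return zip(*(g[k] for k in sorted(g)))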
l1 = [['John', 1], ['Fred', 1], ['Carolyn', 1],
['Kenneth', 3], ['Ronald', 3], ['Dorothy', 2],
['Joyce', 2], ['Julia', 3], ['Deborah', 1],
['Jonathan', 2], ['Aaron', 2], ['Marie', 1],
['Adam', 2], ['Kevin', 2], ['Alice', 2],
['Jerry', 1], ['Kimberly', 1], ['Lawrence', 1],
['Louis', 2], ['Anthony', 1], ['Carolyn', 3],
['Edward', 2], ['Samuel', 3], ['Rachel', 1],
['Kathleen', 1], ['Fred', 3], ['Fred', 3],
['Gerald', 2], ['Donna', 2], ['Keith', 3],
['Matthew', 3], ['Stephanie', 1]]
l2 = [['Heather', 2], ['Evelyn', 1], ['Norma', 1],
['Evelyn', 3], ['Harry', 3], ['Sean', 1],
['Anna', 1], ['Jerry', 3], ['Anna', 3],
['Julia', 1], ['Dorothy', 2]]
print('\n\n'.join('\n'.join(' '.join(n) for n in l) for l in [gum(l1), gum(l2)]))
Output:
John Dorothy Kenneth
Fred Joyce Ronald
Carolyn Jonathan Julia
Deborah Aaron Carolyn
Marie Adam Samuel
Jerry Kevin Fred
Kimberly Alice Fred
Lawrence Louis Keith
Anthony Edward Matthew
Evelyn Heather Evelyn
Norma Dorothy Harry
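Using numpy is another approach. Note that this lays the names out in their original order as three columns without looking at the numbers, and np.resize fills the last slot by repeating names from the start of the list: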
import math
import numpy as np
data = [['John', 1], ['Fred', 1], ['Carolyn', 1], ['Kenneth', 3], ['Ronald', 3], ['Dorothy', 2], ['Joyce', 2], ['Julia', 3], ['Deborah', 1], ['Jonathan', 2], ['Aaron', 2], ['Marie', 1], ['Adam', 2], ['Kevin', 2], ['Alice', 2], ['Jerry', 1], ['Kimberly', 1], ['Lawrence', 1], ['Louis', 2], ['Anthony', 1], ['Carolyn', 3], ['Edward', 2], ['Samuel', 3], ['Rachel', 1], ['Kathleen', 1], ['Fred', 3], ['Fred', 3], ['Gerald', 2], ['Donna', 2], ['Keith', 3], ['Matthew', 3], ['Stephanie', 1]]
array = np.array([x[0] for x in data])
array = np.resize(array,(3,math.ceil(len(array)/3))).T
[" ".join(x) for x in array]
Output:
['John Marie Samuel',
'Fred Adam Rachel',
'Carolyn Kevin Kathleen',
'Kenneth Alice Fred',
'Ronald Jerry Fred',
'Dorothy Kimberly Gerald',
'Joyce Lawrence Donna',
'Julia Louis Keith',
'Deborah Anthony Matthew',
'Jonathan Carolyn Stephanie',
'Aaron Edward John']