Merging ranked lists of tuples based on common id

I have the following ranked lists of tuples:

list1 = [(0.2, 'a'), (0.4, 'b'), (0.5,'d')]
list2 = [(0.1, 'a'), (0.3, 'c'), (0.7, 'x')]
list3 = [(0.5, 'c'), (0.6, 'a'), (0.5, 'b')]

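I want to create an overall ranked list based on the shared letters, as follows:

  1. If a letter appears in all three lists, add its three individual values.
  2. If a letter appears in only two of the lists, add its two individual values plus 1.
  3. If a letter appears in only one list, add 2 to its value.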

Expected result:

[(0.9, 'a'), (1.8, 'c'), (1.9, 'b'), (2.5, 'd'), (2.7, 'x')]

What works:

When an item is common to all three lists I get the expected result, but in the other cases I do not get the correct result.

Code snippet

list1 = [(0.2, 'a'), (0.4, 'b'), (0.5, 'd')]
list2 = [(0.1, 'a'), (0.3, 'c'), (0.7, 'x')]
list3 = [(0.5, 'c'), (0.6, 'a'), (0.5, 'b')]
priority_result = [] # when element is common in all 3 lists
twos_array = [] #when element is common in only two lists

result = [(s1, l1 + l1) for (l1, s1), (l1, s2) in zip(list1, list2)]
print(result)
for (score, resultID) in list1:
    for (score1, resultID1) in list2:
        for (score2, resultID2) in list3:
            if(resultID == resultID1 or resultID == resultID2):                    
                result = [(score + score1 + score2, resultID)]
                priority_result.extend(result)
            elif(resultID == resultID1 and resultID != resultID2):
                result = [(score + score1 + 1, resultID)]
                twos_array.extend(result)

How should I approach this to produce the expected result?

You can swap the order of the tuples to create mappings:

d1 = dict(x[::-1] for x in list1)
d2 = dict(x[::-1] for x in list2)
d3 = dict(x[::-1] for x in list3)

Now you can merge the keys, since dict.keys returns a set-like object:

keys = d1.keys() | d2.keys() | d3.keys()

The rest can be done with dict.get, using a default of 1 for any list the key is missing from:

result = {k: d1.get(k, 1) + d2.get(k, 1) + d3.get(k, 1) for k in keys}

Converting this to a sorted list is straightforward:

sorted(x[::-1] for x in result.items())
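
Put together, the steps above might look like the following minimal sketch. The names totals and ranked, and the round call, are additions for illustration (rounding only hides floating-point noise); the rest follows the snippets above.

```python
list1 = [(0.2, 'a'), (0.4, 'b'), (0.5, 'd')]
list2 = [(0.1, 'a'), (0.3, 'c'), (0.7, 'x')]
list3 = [(0.5, 'c'), (0.6, 'a'), (0.5, 'b')]

# Reverse each (score, id) tuple so the id becomes the dictionary key.
d1 = dict(x[::-1] for x in list1)
d2 = dict(x[::-1] for x in list2)
d3 = dict(x[::-1] for x in list3)

# Key views behave like sets, so | collects every id seen in any list.
keys = d1.keys() | d2.keys() | d3.keys()

# An id missing from a list contributes the default of 1 (the penalty in the rules).
totals = {k: d1.get(k, 1) + d2.get(k, 1) + d3.get(k, 1) for k in keys}

# round() is an assumption added here, only to suppress floating-point noise.
ranked = sorted((round(total, 10), k) for k, total in totals.items())
print(ranked)
```

With the sample data this should reproduce the expected result from the question.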

Suppose your lists are now in a meta-list:

lists = [list1, list2, list3]
dicts = [dict(x[::-1] for x in l) for l in lists]
keys = set().union(*dicts)  # union over the dicts yields their keys, not the tuples
result = {k: sum(d.get(k, 1) for d in dicts) for k in keys}
result = sorted(x[::-1] for x in result.items())

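Here is a slightly simpler solution:

import itertools

# Start every id at len(lists), i.e. a penalty of 1 per list.
mapping = dict.fromkeys((k for _, k in itertools.chain.from_iterable(lists)), len(lists))
for v, k in itertools.chain.from_iterable(lists):
    mapping[k] += v - 1  # each appearance swaps one penalty for the actual value
result = sorted(x[::-1] for x in mapping.items())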

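You can use collections.Counter to do most of the math for you:

from collections import Counter

c = Counter()
for lst in lists:
    c.update({k: v - 1 for v, k in lst})  # update adds to the existing totals
result = sorted((v + len(lists), k) for k, v in c.items())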

The same thing with a regular collections.defaultdict:

from collections import defaultdict
import itertools

d = defaultdict(int)
for v, k in itertools.chain.from_iterable(lists):
    d[k] += v - 1
result = sorted((v + len(lists), k) for k, v in d.items())
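
Because these variants accumulate v - 1 in floating point, the totals may come out as values such as 0.8999999999999999 instead of 0.9. One possible way to package the idea, with rounding applied at the end, is sketched below. The function name merge_ranked and the ndigits parameter are hypothetical, and the sketch assumes each id appears at most once per list.

```python
from collections import defaultdict
from itertools import chain


def merge_ranked(lists, ndigits=10):
    """Hypothetical helper: sum scores per id, adding 1 for each list an id is missing from."""
    totals = defaultdict(int)
    for score, key in chain.from_iterable(lists):
        # Each appearance replaces one penalty of 1 with the actual score.
        totals[key] += score - 1
    # Add back len(lists) penalties, round away float noise, and rank by total.
    return sorted((round(total + len(lists), ndigits), key)
                  for key, total in totals.items())


list1 = [(0.2, 'a'), (0.4, 'b'), (0.5, 'd')]
list2 = [(0.1, 'a'), (0.3, 'c'), (0.7, 'x')]
list3 = [(0.5, 'c'), (0.6, 'a'), (0.5, 'b')]
print(merge_ranked([list1, list2, list3]))
```

The function takes the meta-list directly, so it works for any number of lists without changes.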

You can try using itertools.groupby with operator.itemgetter:

from itertools import groupby
from operator import itemgetter as ig

x = list1 + list2 + list3
y = [l[1] for l in x]  # ids only, used to count occurrences
print(sorted([((3 - y.count(key)) + sum(next(zip(*l))), key) for key, l in groupby(sorted(x, key=ig(1)), key=ig(1))], key=ig(0)))

[(0.9, 'a'), (1.8, 'c'), (1.9, 'b'), (2.5, 'd'), (2.7, 'x')]

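This code concatenates the lists, builds a second list containing only the keys, groups the tuples by key, and sums the values in each group. It also adds the expected penalty based on the number of occurrences: 1 for every list the key is missing from.

Finally, it sorts by the summed and adjusted value.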
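
For readability, the same groupby idea can be unrolled into a loop. This is a sketch rather than the one-liner itself: it counts occurrences with len(scores) instead of y.count(key), and the names pairs, n_lists, and merged are introduced here for illustration.

```python
from itertools import groupby
from operator import itemgetter

list1 = [(0.2, 'a'), (0.4, 'b'), (0.5, 'd')]
list2 = [(0.1, 'a'), (0.3, 'c'), (0.7, 'x')]
list3 = [(0.5, 'c'), (0.6, 'a'), (0.5, 'b')]

n_lists = 3
# groupby only groups adjacent items, so sort by id first.
pairs = sorted(list1 + list2 + list3, key=itemgetter(1))

merged = []
for key, group in groupby(pairs, key=itemgetter(1)):
    scores = [score for score, _ in group]
    # Penalty of 1 for every list this id does not appear in.
    merged.append((sum(scores) + (n_lists - len(scores)), key))

merged.sort()  # rank by total score
print(merged)
```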

list1 = [(0.2, 'a'), (0.4, 'b'), (0.5, 'd')]
list2 = [(0.1, 'a'), (0.3, 'c'), (0.7, 'x')]
list3 = [(0.5, 'c'), (0.6, 'a'), (0.5, 'b')]

d = {}
for t in list1 + list2 + list3:
    d.setdefault(t[1], []).append(t[0])
lst = [(sum(v, 3 - len(v)), k) for k, v in d.items()]
print(lst)  # [(0.9, 'a'), (1.9, 'b'), (2.5, 'd'), (1.8, 'c'), (2.7, 'x')]
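
The snippet above relies on the second argument of sum, the start value: seeding the sum with 3 - len(v) adds the penalty for missing lists in the same step. A small sketch of that, with a sort added so the result is ranked (the names scores_by_id and ranked are introduced here for illustration):

```python
list1 = [(0.2, 'a'), (0.4, 'b'), (0.5, 'd')]
list2 = [(0.1, 'a'), (0.3, 'c'), (0.7, 'x')]
list3 = [(0.5, 'c'), (0.6, 'a'), (0.5, 'b')]

# Collect every score under its id.
scores_by_id = {}
for score, key in list1 + list2 + list3:
    scores_by_id.setdefault(key, []).append(score)

# sum(values, start) begins counting from start, so the penalty seeds the total:
# an id with two scores starts from 3 - 2 = 1, an id with one score from 2.
ranked = sorted((sum(v, 3 - len(v)), k) for k, v in scores_by_id.items())
print(ranked)
```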