如何合并具有公共元素的列表,其中这些元素本身是 lists/tuples?

How do I merge lists with common elements, where these elements themselves are lists/tuples?

我有这个嵌套列表,我需要以某种方式遍历它以合并具有公共元素(这些元素本身就是列表)的内部列表。这是具有现有原始数据模式的简化数据:

data = [
        [[0,1],[2,3],[4,5]], 
        [[2,3],[4,5]], 
        [[4,5]], 
        [[6,7],[8,9],[10,11]], 
        [[8,9],[10,11]], 
        [[10,11]], 
        [[12,13],[14,15],[16,17],[18,19],[20,21]], 
        [[14,15],[16,17],[18,19],[20,21]], 
        [[16,17],[18,19],[20,21]],
        [[18,19],[20,21]],
        [[20,21]]
       ]  

我想获得一个合并的嵌套列表,如下所示:

merged = [
      [[0,1],[2,3],[4,5]], 
      [[6,7],[8,9],[10,11]], 
      [[12,13],[14,15],[16,17],[18,19],[20,21]]
     ]

下面是我尝试过的方法,不幸的是它没有超出第二个内部 for loop 并且返回错误 AttributeError: 'int' object has no attribute 'values':

tmp = {}
for subl in original:
    subl = list(set(subl))                        # Eliminating duplicates by first converting to a set
    subl = dict(zip(subl, range(len(subl))))      # Create dictionary from list
    sublswitch = {y:x for x,y in subl.items()}    # Swap dictionary key for values and vice versa                
    ii = 0
    for key in sublswitch:            
        tmp.setdefault(key.values(), set()).add(list(key.values())[ii])
        ii += 1                                         

out = []
for k, v in tmp.items():
    out.append([[k, i] for i in v])

试试这个循环:

newdata = []
for lst in data:
    if newdata:
        x = [i for i in lst if i not in newdata[-1]]
        if x:
            newdata.append(x)
    else:
        newdata.append(lst)
print(newdata)

输出:

[[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]], [[12, 13], [14, 15], [16, 17], [18, 19], [20, 21]]]

这是一个使用 O(n) 附加 space 的解决方案,它跟踪所有已经使用散列添加的子列表 table(因此,所有查找都只是 O(1 )).

added = set()
merged = []

for item in data:
    filtered = list(filter(lambda value: value not in added, map(tuple, item)))
    if filtered:
        added.update(filtered)
        merged.append(list(map(list, filtered)))

print(merged)

输出

[[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]], [[12, 13], [14, 15], [16, 17], [18, 19], [20, 21]]]

更新

上面的解决方案只是防止重复的列表项在下一个列表中再次出现。在这里,我们不仅要防止它们再次发生,还要将它们合并到现有的上。结果,这个算法的时间复杂度有点O(n^2).

merged = []

for item in data:
    item_set = set(map(tuple, item))
    for result in merged:
        if result & item_set:
            result.update(item_set)
            break
    else:
        merged.append(item_set)

merged = [sorted(map(list, item)) for item in merged]
print(merged)

已更新

试试这个,

def sort_list(data):
    data = sorted(data, key=itemgetter(0))
    return_list = []
    for lst in data:
        new_list = sorted(lst, key=itemgetter(0))
        if not return_list:
            return_list.append(new_list)
            continue
        if new_list[0] not in return_list[-1]:
            return_list.append(new_list)
        if new_list[0] in return_list[-1] and len(new_list) > 
                                                len(return_list[-1]):
            return_list.remove(return_list[-1])
            return_list.append(new_list)
    return return_list

所以你可以对列表进行排序

data = [[],
    [[0,1],[2,3]],
    [[0,1],[2,3],[4,5]], 
    [[2,3],[4,5]], 
    [[4,5]], 
    [[6,7],[8,9],[10,11]], 
    [[8,9],[10,11]], 
    [[10,11]], 
    [[12,13],[14,15],[16,17],[18,19],[20,21]], 
    [[14,15],[16,17],[18,19],[20,21]], 
    [[16,17],[18,19],[20,21]],
    [[18,19],[20,21]],
    [[20,21]]
   ] 

[[],
 [[0, 1], [2, 3], [4, 5]],
 [[6, 7], [8, 9], [10, 11]],
 [[12, 13], [14, 15], [16, 17], [18, 19], [20, 21]]]