Rearranging a dictionary based on a function-condition over its items

(Related to a question I asked a few days ago)

I have a dictionary whose keys are strings and whose values are sets of integers, for example:

db = {"a":{1,2,3}, "b":{5,6,7}, "c":{2,5,4}, "d":{8,11,10,18}, "e":{0,3,2}}

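I want a procedure that joins the keys whose values satisfy a certain generic condition given in an external function. The new item will therefore have the union of both keys as its key (the order is not important). The value will be determined by the condition itself.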

For example, given this condition function:

def condition(kv1: tuple, kv2: tuple):
  key1, val1 = kv1
  key2, val2 = kv2

  union = val1 | val2 #just needed for the following line
  maxDif = max(union) - min(union)

  newVal = set()
  for i in range(maxDif):
    auxVal1 = {pos - i for pos in val2}
    auxVal2 = {pos + i for pos in val2}
    intersection1 = val1.intersection(auxVal1)
    intersection2 = val1.intersection(auxVal2)
    print(intersection1, intersection2)
    if (len(intersection1) >= 3):
      newVal.update(intersection1)
    if (len(intersection2) >= 3):
      newVal.update({pos - i for pos in intersection2})

  if len(newVal)==0:
    return False
  else:
    newKey = "".join(sorted(key1+key2))
    return newKey, newVal

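That is, a pair of items satisfies the condition when at least 3 numbers in one value set are at the same distance (difference) from numbers in the other. As stated, if the condition is satisfied, the resulting key is the union of both keys. For this particular example, the value is the (minimum) matching numbers from the original value sets.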
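To make the matching rule concrete, here is a minimal sketch. The helper name `matching_offsets` and its `min_match` parameter are hypothetical and not part of the code above. It lists every offset at which at least 3 numbers of the first set line up with numbers of the second set. Unlike `condition`, it always reports the elements of the first set rather than the smaller numbers.

```python
def matching_offsets(val1, val2, min_match=3):
    """Return {offset: elements of val1 that align with val2 at that offset}.

    Hypothetical helper for illustration only: an offset d is reported when
    at least `min_match` elements x of val1 satisfy x + d in val2.
    """
    offsets = {}
    # Only differences that actually occur between the two sets can match.
    for d in {y - x for x in val1 for y in val2}:
        aligned = {x for x in val1 if x + d in val2}
        if len(aligned) >= min_match:
            offsets[d] = aligned
    return offsets


print(matching_offsets({1, 2, 3}, {5, 6, 7}))  # {4: {1, 2, 3}}
print(matching_offsets({1, 2, 3}, {2, 5, 4}))  # {}
```

So "a" and "b" match because all three numbers of "a" sit exactly 4 below a number of "b", while "a" and "c" never line up on more than two numbers.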

How can I cleverly apply a function like this to a dictionary like db? Given the dictionary above, the expected result would be:

result = {"ab":{1,2,3}, "cde":{0,3,2}, "d":{18}}

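Your "condition" in this case is more than just a condition. It is actually a merging rule that identifies which values to keep and which to drop. Depending on how the patterns and merge rules vary, this may or may not allow a generalized approach.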

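Given this, each merge operation can leave values behind in the original keys, and those leftovers may in turn merge with some of the remaining keys. Multiple merges can also occur (e.g. key "cde"). In theory, the merging process would need to cover the power set of all keys, which may be impractical. Alternatively, it can be performed by successive refinement using pairs of (original and/or merged) keys.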
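As a rough illustration of why the pairwise route is attractive, the sketch below counts the candidates each strategy would have to examine for the five keys of db. The variable names are only for illustration.

```python
from itertools import combinations, permutations

keys = ["a", "b", "c", "d", "e"]

# Every group of 2 or more keys: what an exhaustive (power set) search examines.
groups = [g for r in range(2, len(keys) + 1) for g in combinations(keys, r)]

# Ordered key pairs: what a single pass of pairwise merging examines.
pairs = list(permutations(keys, 2))

print(len(groups))  # 26, i.e. 2**5 - 5 - 1
print(len(pairs))   # 20, i.e. 5 * 4
```

With 5 keys the two counts are close, but the number of groups grows as 2**n while the number of pairs grows as n*(n-1). For 20 keys that is 1,048,555 groups against 380 pairs per pass.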

Merge condition/function:

db = {"a":{1,2,3}, "b":{5,6,7}, "c":{2,5,4}, "d":{8,11,10,18}, "e":{0,3,2}}

from itertools import product
from collections import Counter

# Apply condition and return a keep-set and a remove-set
# the keep-set will be empty if the matching condition is not met
def merge(A,B):
    minMatch = 3
    distances = Counter(b-a for a,b in product(A,B) if b>=a)
    delta     = [d for d,count in distances.items() if count>=minMatch]
    keep      = {a for a in A if any(a+d in B for d in delta)}
    remove    = {b for b in B if any(b-d in A for d in delta)}
    if len(keep)>=minMatch: return keep,remove
    return None,None
    
    
print( merge(db["a"],db["b"]) )  # ({1, 2, 3}, {5, 6, 7})
print( merge(db["e"],db["d"]) )  # ({0, 2, 3}, {8, 10, 11})   
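To see where the keep-set comes from, this small sketch reproduces only the distance-counting step for "a" and "b". It uses the same `Counter` and `product` calls as `merge`; nothing else is assumed.

```python
from collections import Counter
from itertools import product

A, B = {1, 2, 3}, {5, 6, 7}

# Count how often each non-negative difference b - a occurs over all pairs.
distances = Counter(b - a for a, b in product(A, B) if b >= a)

print(sorted(distances.items()))  # [(2, 1), (3, 2), (4, 3), (5, 2), (6, 1)]
```

Only the distance 4 occurs at least 3 times, so it is the only entry in `delta`, and every element of A that has a partner in B at that distance ends up in the keep-set.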

Merge process:

# combine dictionary keys using a merging function/condition
def combine(D,mergeFunction):
    result  = { k:set(v) for k,v in D.items() }  # start with copy of input
    merging = True    
    while merging:    # keep merging until no more merges are performed
        merging = False   
        for a,b in product(list(result), repeat=2):   # all ordered key pairs
            if a==b: continue
            if a not in result or b not in result: continue # keys still there?
            m,n = mergeFunction(result[a],result[b])        # call merge function
            if not m : continue                             # if merged ...
            mergedKey = "".join(sorted(set(a+b)))             # combine keys
            result[mergedKey] = m                             # add merged set
            if mergedKey != a: result[a] -= m; merging = True # clean/clear
            if not result[a]: del result[a]                   # original sets,
            if mergedKey != b: result[b] -= n; merging = True # do more merges
            if not result[b]: del result[b]
    return result
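Putting the two pieces together on db: the snippet below restates `merge` and `combine` in a slightly tidied form so that it runs on its own. The `min_match` parameter and the snake_case names are my own adjustments, and the behaviour is intended to be the same as the code above.

```python
from collections import Counter
from itertools import product

db = {"a": {1, 2, 3}, "b": {5, 6, 7}, "c": {2, 5, 4},
      "d": {8, 11, 10, 18}, "e": {0, 3, 2}}


def merge(A, B, min_match=3):
    # Distances b - a shared by at least min_match pairs define a match.
    distances = Counter(b - a for a, b in product(A, B) if b >= a)
    delta = [d for d, count in distances.items() if count >= min_match]
    keep = {a for a in A if any(a + d in B for d in delta)}
    remove = {b for b in B if any(b - d in A for d in delta)}
    return (keep, remove) if len(keep) >= min_match else (None, None)


def combine(D, merge_function):
    result = {k: set(v) for k, v in D.items()}   # work on a copy of the input
    merging = True
    while merging:                               # repeat until a pass merges nothing
        merging = False
        for a, b in product(list(result), repeat=2):
            if a == b or a not in result or b not in result:
                continue
            m, n = merge_function(result[a], result[b])
            if not m:
                continue
            merged_key = "".join(sorted(set(a + b)))
            result[merged_key] = m
            if merged_key != a:                  # drop merged values from the sources
                result[a] -= m
                merging = True
            if not result[a]:
                del result[a]
            if merged_key != b:
                result[b] -= n
                merging = True
            if not result[b]:
                del result[b]
    return result


combined = combine(db, merge)
print({k: sorted(v) for k, v in combined.items()})
# {'d': [18], 'ab': [1, 2, 3], 'cde': [0, 2, 3]}
```

This matches the expected result from the question: "c" first merges with "d" (leaving 18 behind in "d"), and the intermediate "cd" then merges with "e" on the next pass to give "cde".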