Get unique list items by nested dictionary key

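I have lists stored as dictionary values, nested inside another dictionary called data. I've been trying to find a fast way to get all the unique list items from a particular nested key, such as key1 or key2.

I came up with the following function, which doesn't seem very efficient. Any ideas on how I can speed it up and make it more Pythonic?

Python function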

def get_uniq_by_value(data, val_name):
    results = []
    for key, value in data.items():
        for item in value[val_name]:
            if item not in results:
                results.append(item)
    return results

Sample data

data = {
"top1": {
    "key1": [
        "there is no spoon", "but dictionaries are hard",
    ],
    "key2": [
        "mad max fury road was so good",
    ]
},
"top2": {
    "key1": [
        "my item", "foo bar"
    ],
    "key2": [
        "blah", "more junk"
    ]
},
}

If order doesn't matter, you can use a set / set comprehension to get the result you need:

def get_uniq_by_value(data, val_name):
    return {val for value in data.values() for val in value.get(val_name,[])}

If you want a list as the result, you can wrap the set comprehension in list() to convert the resulting set to a list before returning it.

Demo:
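A minimal sketch of that list-returning variant, assuming the same nested layout as in the question. The name get_uniq_list_by_value and the small sample dictionary are hypothetical, chosen only to keep this apart from the set-returning function above. The order of the returned list is arbitrary, since it comes from a set.

```python
def get_uniq_list_by_value(data, val_name):
    # Build the set of unique items first, then convert it to a list.
    # The order of the resulting list is arbitrary, because sets are unordered.
    return list({val for value in data.values() for val in value.get(val_name, [])})


# Hypothetical sample data in the same shape as the question's data.
data = {
    "top1": {"key1": ["a", "b"], "key2": ["c"]},
    "top2": {"key1": ["b", "d"], "key2": ["e"]},
}
result = get_uniq_list_by_value(data, "key1")
print(sorted(result))  # sorted only to make the printed order predictable
```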

>>> def get_uniq_by_value(data, val_name):
...     return {val for value in data.values() for val in value.get(val_name,[])}
...
>>> data = {
... "top1": {
...     "key1": [
...         "there is no spoon", "but dictionaries are hard",
...     ],
...     "key2": [
...         "mad max fury road was so good",
...     ]
... },
... "top2": {
...     "key1": [
...         "my item", "foo bar"
...     ],
...     "key2": [
...         "blah", "more junk"
...     ]
... }}
>>> get_uniq_by_value(data,"key1")
{'but dictionaries are hard', 'my item', 'foo bar', 'there is no spoon'}

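As noted in the comments below, if order matters and data is already an OrderedDict (collections.OrderedDict), you can create a new OrderedDict and add the list elements to it as keys. The OrderedDict will drop any duplicates and preserve the order in which the keys were added.

You can also do this in one line using OrderedDict.fromkeys, as noted in the comments. Example: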
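The explicit form of that idea might look like the sketch below. The function name get_uniq_by_value_ordered and the sample data are hypothetical; the values stored against each key are just None placeholders, since only the keys matter.

```python
from collections import OrderedDict


def get_uniq_by_value_ordered(data, val_name):
    seen = OrderedDict()
    for value in data.values():
        for item in value.get(val_name, []):
            # Assigning an existing key again keeps its original position,
            # so duplicates are dropped and first-seen order is preserved.
            seen[item] = None
    return list(seen)


# Hypothetical sample data; the outer mapping is an OrderedDict so that
# the iteration order of "top1" and "top2" is well defined.
data = OrderedDict([
    ("top1", {"key1": ["x", "y"], "key2": ["z"]}),
    ("top2", {"key1": ["y", "w"], "key2": ["z"]}),
])
print(get_uniq_by_value_ordered(data, "key1"))  # ['x', 'y', 'w']
```

Unlike the `item not in results` check on a list, a key lookup in a dictionary takes roughly constant time, so this avoids rescanning the results for every item.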

from collections import OrderedDict
def get_uniq_by_value(data, val_name):
    return list(OrderedDict.fromkeys(val for value in data.values() for val in value.get(val_name,[])))

Note that this only works if data is a nested OrderedDict; otherwise the elements of data are not in any particular order to begin with.

Demo:

>>> from collections import OrderedDict
>>> data = OrderedDict([
... ("top1", OrderedDict([
...     ("key1", [
...         "there is no spoon", "but dictionaries are hard",
...     ]),
...     ("key2", [
...         "mad max fury road was so good",
...     ])
... ])),
... ("top2", OrderedDict([
...     ("key1", [
...         "my item", "foo bar"
...     ]),
...     ("key2", [
...         "blah", "more junk"
...     ])
... ]))])
>>>
>>> def get_uniq_by_value(data, val_name):
...     return list(OrderedDict.fromkeys(val for value in data.values() for val in value.get(val_name,[])))
...
>>> get_uniq_by_value(data,"key1")
['there is no spoon', 'but dictionaries are hard', 'my item', 'foo bar']
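On Python 3.7 and later, the built-in dict also preserves insertion order, so the same idea can be sketched without OrderedDict. This relies on an assumption about the interpreter version and is not part of the approach above; the function name get_uniq_by_value_py37 and the shortened sample data are hypothetical.

```python
def get_uniq_by_value_py37(data, val_name):
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    # (guaranteed for plain dicts from Python 3.7 onward).
    return list(dict.fromkeys(
        val for value in data.values() for val in value.get(val_name, [])
    ))


# Hypothetical sample data with one duplicate across the two top-level keys.
data = {
    "top1": {"key1": ["there is no spoon", "but dictionaries are hard"]},
    "top2": {"key1": ["my item", "there is no spoon"]},
}
print(get_uniq_by_value_py37(data, "key1"))
# ['there is no spoon', 'but dictionaries are hard', 'my item']
```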