Efficiently extracting a set of unique values from lists within a dictionary
My data structure looks like this:
{'A': [2, 3, 5, 6], 'B': [1, 2, 4, 7], 'C': [1, 3, 4, 5, 7], 'D': [1, 4, 5, 6], 'E': [3, 4]}
Using Python, I need to extract this:
{1, 2, 3, 4, 5, 6, 7}
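because I need the count of distinct values for downstream mathematical equations.
Here is my current implementation, which works (complete code example):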
from itertools import chain
# Create some mock data for testing
dictionary_with_lists = {'A': [2, 3, 5, 6],
                         'B': [1, 2, 4, 7],
                         'C': [1, 3, 4, 5, 7],
                         'D': [1, 4, 5, 6],
                         'E': [3, 4]}
print(dictionary_with_lists)
# Output: {'A': [2, 3, 5, 6], 'B': [1, 2, 4, 7], 'C': [1, 3, 4, 5, 7], 'D': [1, 4, 5, 6], 'E': [3, 4]}
# Flatten dictionary to list of lists, discarding the keys
list_of_lists = [dictionary_with_lists[i] for i in dictionary_with_lists]
print(f'list_of_lists: {list_of_lists}')
# Output: list_of_lists: [[2, 3, 5, 6], [1, 2, 4, 7], [1, 3, 4, 5, 7], [1, 4, 5, 6], [3, 4]]
# Use itertools to flatten the list
flat_list = list(chain.from_iterable(list_of_lists))
print(f'flat_list: {flat_list}')
# Output: flat_list: [2, 3, 5, 6, 1, 2, 4, 7, 1, 3, 4, 5, 7, 1, 4, 5, 6, 3, 4]
# Convert list to set to get only unique values
set_of_unique_items = set(flat_list)
print(f'set_of_unique_items: {set_of_unique_items}')
# Output: set_of_unique_items: {1, 2, 3, 4, 5, 6, 7}
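Although this works, I suspect there may be a simpler and more efficient approach.
What would be a more efficient implementation that does not reduce the readability of the code?
My real-world dictionary contains hundreds of thousands or millions of lists of arbitrary length.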
Try this:
from itertools import chain
d = {'A': [2, 3, 5, 6], 'B': [1, 2, 4, 7], 'C': [1, 3, 4, 5, 7], 'D': [1, 4, 5, 6], 'E': [3, 4]}
print(set(chain.from_iterable(d.values())))
Output:
{1, 2, 3, 4, 5, 6, 7}
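For comparison, here is a minimal sketch of an alternative that needs no import. It relies on the fact that `set.union` accepts any number of iterables. The names `unique_values` and `distinct_count` are only illustrative, and the `len` call assumes that the count of distinct values is what the downstream calculation needs.

```python
# Same mock data as in the question.
d = {'A': [2, 3, 5, 6], 'B': [1, 2, 4, 7], 'C': [1, 3, 4, 5, 7],
     'D': [1, 4, 5, 6], 'E': [3, 4]}

# set.union takes any number of iterables, so the lists can be unpacked directly.
unique_values = set().union(*d.values())
print(unique_values)

# Assumption: the downstream calculation needs the number of distinct values.
distinct_count = len(unique_values)
print(distinct_count)
```

Note that unpacking with `*` first builds a tuple of references to every list, so with millions of lists the `chain.from_iterable` version may be the lighter choice. It is worth measuring both on real data.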
s = set()
for key in dictionary_with_lists:
    for val in dictionary_with_lists[key]:
        s.add(val)
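An outsider's view:
# Named d rather than dict to avoid shadowing the built-in dict type
d = {'A': [2, 3, 5, 6], 'B': [1, 2, 4, 7], 'C': [1, 3, 4, 5, 7], 'D': [1, 4, 5, 6], 'E': [3, 4]}
S = set()
for L in d.values():
    # update() adds the items in place instead of building a new set on every iteration
    S.update(L)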
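Since the question is about efficiency on very large dictionaries, a rough way to compare the approaches from the answers is to time them on synthetic data. The sketch below is only an illustration: the data shape (10,000 lists of 1 to 20 integers), the function names, and the `with_update` variant (one `set.update` call per list) are assumptions, not part of the original thread, and the timings will vary by machine and by how the real data is distributed.

```python
import random
import timeit
from itertools import chain

# Hypothetical synthetic data: 10,000 lists of 1 to 20 integers each.
random.seed(0)
data = {i: [random.randrange(1000) for _ in range(random.randint(1, 20))]
        for i in range(10_000)}

def with_chain(d):
    # Flatten lazily and build the set in one pass.
    return set(chain.from_iterable(d.values()))

def with_nested_loops(d):
    # Add each value individually.
    s = set()
    for values in d.values():
        for val in values:
            s.add(val)
    return s

def with_update(d):
    # Add each list to the set in place, one call per list.
    s = set()
    for values in d.values():
        s.update(values)
    return s

candidates = [with_chain, with_nested_loops, with_update]

# All approaches should agree before their timings are compared.
expected = with_chain(data)
assert all(f(data) == expected for f in candidates)

for f in candidates:
    seconds = timeit.timeit(lambda: f(data), number=20)
    print(f'{f.__name__}: {seconds:.3f} s for 20 runs')
```

Running this against a sample of the real dictionary, rather than the made-up data here, should give a more trustworthy picture of which version to keep.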