Efficiently sorting parallel lists in a Python dictionary

Question

The title says most of it: I'm looking for an efficient way to sort a dictionary of parallel lists.

unsorted_my_dict = {
   'key_one': [1,6,2,3],
   'key_two': [4,1,9,7],
   'key_three': [1,2,4,3],
   ...
}
sorted_my_dict = {
   'key_one': [1,6,3,2],
   'key_two': [4,1,7,9],
   'key_three': [1,2,3,4],
   ...
}

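I want to sort key_three and have every other list in the dictionary sorted in parallel with it. There are a few similar questions, but I'm struggling because the dictionary has an unknown number of keys to sort, and the only thing I know is the name of the key I want to sort by (key_three).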

I'd like to do this in vanilla Python, with no third-party dependencies.

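Edit 1: What do I mean by parallel? I mean that if sorting key_three requires swapping its last two values, then every other list in the dictionary must have its last two values swapped as well.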

Edit 2: Python 3.4 specifically.

Answer 1

You can first sort the enumerate of the target list to recover the desired order of indices, and then rearrange every list in that order.

my_dict = {
   'key_one': [1,6,2,3],
   'key_two': [4,1,9,7],
   'key_three': [1,2,4,3],
}


def parallel_sort(d, key):
    index_order = [i for i, _ in sorted(enumerate(d[key]), key=lambda x: x[1])]
    return {k: [v[i] for i in index_order] for k, v in d.items()}

print(parallel_sort(my_dict, 'key_three'))

Output

{'key_one': [1, 6, 3, 2],
 'key_two': [4, 1, 7, 9],
 'key_three': [1, 2, 3, 4]}
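
As a variation on the same idea, the index order can also be computed by sorting the positions themselves instead of the enumerate pairs. The following is only a minimal sketch: the name `parallel_sort_by_range` and the `reverse` parameter are illustrative and not part of the answer above.

```python
def parallel_sort_by_range(d, key, reverse=False):
    # Sort the positions 0..n-1 by the value stored at each position in d[key].
    values = d[key]
    index_order = sorted(range(len(values)), key=values.__getitem__, reverse=reverse)
    # Apply the same permutation to every list; the input dict is left unchanged.
    return {k: [v[i] for i in index_order] for k, v in d.items()}


example = {
    'key_one': [1, 6, 2, 3],
    'key_two': [4, 1, 9, 7],
    'key_three': [1, 2, 4, 3],
}
result = parallel_sort_by_range(example, 'key_three')
print(result)
```

Because `sorted` is stable, positions with equal values in the sort key keep their original relative order.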

Answer 2

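Zip the value lists together, sort the resulting groups with a key function based on the relevant item, then zip again to restore the original shape: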

sorted_value_groups = sorted(zip(*unsorted_my_dict.values()), key=lambda _, it=iter(unsorted_my_dict['key_three']): next(it))
sorted_values = zip(*sorted_value_groups)
sorted_my_dict = {k: list(newvals) for k, newvals in zip(unsorted_my_dict, sorted_values)}

It's not clean at all; I mostly posted it for fun. As a one-liner:

sorted_my_dict = {k: list(newvals) for k, newvals in zip(unsorted_my_dict, zip(*sorted(zip(*unsorted_my_dict.values()), key=lambda _, it=iter(unsorted_my_dict['key_three']): next(it))))}

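This works because, although dict iteration order isn't guaranteed before 3.7, the order is guaranteed to be repeatable for an unmodified dict. Similarly, the key function is called on the items in order from first to last, so pulling the sort key from an iterator on each call is safe. We simply detach all the values, group them by index, sort the groups by the value of the sort key at that index, regroup them by key, and reattach them to their original keys.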

The output is exactly as requested (the order of the original keys is preserved on CPython 3.6 or on any Python 3.7 or later):

sorted_my_dict = {
   'key_one': [1,6,3,2],
   'key_two': [4,1,7,9],
   'key_three': [1,2,3,4]
}
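
The same zip, sort, zip sequence can be written without the stateful iterator by looking up the sort key's column position inside each zipped group. This is a sketch rather than the code from the answer: `zip_sort` is a hypothetical name, and it assumes the lists are non-empty and of equal length.

```python
def zip_sort(d, key):
    keys = list(d)                     # fix one key order for both zips
    pos = keys.index(key)              # column that holds the sort key
    rows = zip(*(d[k] for k in keys))  # one tuple per index across all lists
    sorted_rows = sorted(rows, key=lambda row: row[pos])
    columns = zip(*sorted_rows)        # transpose back to one sequence per key
    return {k: list(col) for k, col in zip(keys, columns)}


unsorted_example = {
    'key_one': [1, 6, 2, 3],
    'key_two': [4, 1, 9, 7],
    'key_three': [1, 2, 4, 3],
}
sorted_example = zip_sort(unsorted_example, 'key_three')
print(sorted_example)
```

Indexing the group with `row[pos]` costs one extra lookup per item, but the key function no longer relies on being called exactly once per item in input order.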

Answer 3

First, get the index order from the given sort key. Then rearrange every list in the dictionary in that order.

unsorted_my_dict = {
    'key_one': [1, 6, 2, 3],
    'key_two': [4, 1, 9, 7],
    'key_three': [1, 2, 4, 3],
}


def sort_parallel_by_key(my_dict, key):
    def sort_by_indices(idx_seq):
        return {k: [v[i] for i in idx_seq] for k, v in my_dict.items()}

    indexes = [idx for idx, _ in sorted(enumerate(my_dict[key]), key=lambda foo: foo[1])]
    return sort_by_indices(indexes)


print(sort_parallel_by_key(unsorted_my_dict, 'key_three'))
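
If building a new dictionary is not wanted, the same index order can be applied to the existing lists in place. The sketch below is an assumption-laden variant, not part of the answer: `sort_parallel_in_place` is a hypothetical name, and the length check is added here only because the approach requires all lists to be the same length.

```python
def sort_parallel_in_place(my_dict, key):
    # Parallel sorting only makes sense if every list has the same length.
    lengths = {len(v) for v in my_dict.values()}
    if len(lengths) > 1:
        raise ValueError("all lists must have the same length")
    # Compute the permutation before any list is modified.
    target = my_dict[key]
    order = sorted(range(len(target)), key=target.__getitem__)
    # Slice assignment rewrites each list object without replacing it.
    for values in my_dict.values():
        values[:] = [values[i] for i in order]


data = {
    'key_one': [1, 6, 2, 3],
    'key_two': [4, 1, 9, 7],
    'key_three': [1, 2, 4, 3],
}
sort_parallel_in_place(data, 'key_three')
print(data)
```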
