Sort a list of lists with duplication constraint

Question

I have a list of sublists, where the first element of each sublist is a product and the second element is its price:

my_list = [['a',100],['b',100],['a',75],['c',120],['a',400],['c',150]]

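I want to sort by price in descending order, but I want a product to repeat only after one item of each product has already been seen. In the example there are three distinct products, 'a', 'b' and 'c', so the sorted result would be: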

sorted_list = [['a',400],['c',150],['b',100],['c',120],['a',100],['a',75]]

Is this possible in one pass?

Answer 1

Group by product, sort each product's list separately, then interleave the sorted lists.

Grouping is usually done with a dictionary of lists. Interleaving can be done with itertools.zip_longest:

from itertools import chain, zip_longest
from operator import itemgetter

def interleave_sort(l, marker="__dummyvalue__"):
    # GROUPING
    groups = {}
    for product, price in l:
        groups.setdefault(product, []).append(price)
    for product_list in groups.values():
        product_list.sort(reverse=True)
    sublists = ([[product, price] for price in prices] for product, prices in groups.items())
    # INTERLEAVING
    _marker = (marker, 0)
    i = chain.from_iterable(sorted(r, key=itemgetter(1), reverse=True) for r in zip_longest(*sublists, fillvalue=_marker))
    return (x for x in i if x is not _marker)

print(list(interleave_sort([['a',100],['b',100],['a',75],['c',120],['a',400],['c',150]])))
# [['a', 400], ['c', 150], ['b', 100], ['c', 120], ['a', 100], ['a', 75]]

print(list(interleave_sort([['a', 2], ['a', 2], ['a', 2], ['b', 1], ['b', 1], ['b', 1], ['c', 0], ['c', 0], ['c', 0]])))
# [['a', 2], ['b', 1], ['c', 0], ['a', 2], ['b', 1], ['c', 0], ['a', 2], ['b', 1], ['c', 0]]

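Note that the last two lines, i = ... and return ..., are adapted from the source code of more_itertools.interleave_longest. I only added an extra sorted(...) so that each round of products is ordered by descending price, as you asked.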
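The same ordering can also be reached without a fill value. The following is a minimal sketch of a different formulation, not the code above: sort everything by price, number each item by how many times its product has already appeared, then stable-sort by that number. The function name round_rank_sort is made up for illustration, and items with equal prices inside one round may come out in a different order than with interleave_sort.

```python
from collections import defaultdict

def round_rank_sort(items):
    # Hypothetical helper, not part of the answer above.
    # Step 1: order all items by price, highest first.
    by_price = sorted(items, key=lambda item: item[1], reverse=True)
    # Step 2: tag each item with its "round", i.e. how many items
    # of the same product have already been seen.
    seen = defaultdict(int)
    tagged = []
    for product, price in by_price:
        tagged.append((seen[product], [product, price]))
        seen[product] += 1
    # Step 3: a stable sort by round keeps the descending price
    # order inside each round.
    tagged.sort(key=lambda pair: pair[0])
    return [item for _, item in tagged]

my_list = [['a',100],['b',100],['a',75],['c',120],['a',400],['c',150]]
print(round_rank_sort(my_list))
# [['a', 400], ['c', 150], ['b', 100], ['c', 120], ['a', 100], ['a', 75]]
```

This version sorts the whole list twice, so it is no cheaper than grouping and interleaving; it only trades the zip_longest bookkeeping for a counter.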

Answer 2

Is this possible in one pass?

No, in general this cannot be done in one pass.

This can be shown by a reduction from the problem of sorting a list.

Consider the special case where there is only one product:

my_list = [['a', 1], ['a', 7], ['a', 9], ['a', 3], ['a', 5], ['a', 4], ['a', 8], ['a', 0], ['a', 6], ['a', 2]]

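Then your problem is exactly as hard as sorting a list. In general, sorting cannot be done in one pass: any comparison-based sort needs on the order of n log2(n) comparisons in the worst case, where n is the number of items in the list. That's a theorem.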
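The reduction can be checked on the single-product list above. This is a rough sketch under two assumptions: constrained_sort is a hypothetical stand-in for any correct solution to the question (here a compact group, sort and interleave), and the bound used is the standard information-theoretic one for comparison sorts, ceil(log2(n!)), which grows like n log2(n).

```python
import math
from itertools import zip_longest

def constrained_sort(items):
    # Hypothetical stand-in for any correct solution to the question.
    groups = {}
    for product, price in items:
        groups.setdefault(product, []).append(price)
    columns = [[[product, price] for price in sorted(prices, reverse=True)]
               for product, prices in groups.items()]
    result = []
    for round_items in zip_longest(*columns):  # missing slots are None
        present = [x for x in round_items if x is not None]
        result.extend(sorted(present, key=lambda x: x[1], reverse=True))
    return result

single = [['a', 1], ['a', 7], ['a', 9], ['a', 3], ['a', 5],
          ['a', 4], ['a', 8], ['a', 0], ['a', 6], ['a', 2]]

# With one product, the constrained order is plain descending order,
# so solving the question also sorts the prices.
prices_out = [price for _, price in constrained_sort(single)]
print(prices_out == sorted((p for _, p in single), reverse=True))  # True

# Minimum worst-case comparisons for any comparison sort of n items.
n = len(single)
min_comparisons = math.ceil(math.log2(math.factorial(n)))
print(min_comparisons)  # 22
```

For n = 10 a single pass makes at most 9 adjacent comparisons, well short of the 22 that the bound requires in the worst case.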

python

sorting