Optimizing element searching in a Python heap

Question

I am looking for an object in a Python heap. Technically I am checking for its absence, but I assume the logic is similar.

import heapq

heap = []
heapq.heappush(heap, (10, object))
if object not in [k for v, k in heap]:
    pass  ## code goes here ##

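However, this check is the slowest (most processor-intensive) part of my program, because the heap contains a very large number of elements.

Can this search be optimized? If so, how?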

Answer 1

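You can't do this with heapq, but here is a compatible implementation that works as long as the heap never contains multiple copies of the same element.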

https://github.com/elplatt/python-priorityq

Answer 2

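heapq is a binary heap implementation of a priority queue. A binary heap makes a very efficient priority queue but, as you have discovered, finding an item requires a sequential search.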
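To see why the lookup has to be sequential: heapq keeps the heap in an ordinary list, and the heap property only guarantees where the smallest element is. A small illustration (the sample values are arbitrary):

```python
import heapq

heap = []
for priority in [5, 1, 4, 2, 3]:
    heapq.heappush(heap, priority)

# Only the smallest element has a known position: index 0.
print(heap[0])        # 1

# The rest of the list is not kept in sorted order, so a membership
# test cannot binary-search; "in" walks the list element by element.
print(4 in heap)      # True
```

With n items in the heap, each such check costs O(n), which is what makes it the hot spot in the question.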

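If all you need to know is whether an item is in the queue, then your best bet is probably to maintain a dictionary of the items that are in the queue. So when you add something to the queue, your code looks something like: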

import heapq

# Items are assumed to be orderable (so heapq can compare them)
# and to expose a hashable .key attribute.
theQueue = []    # the binary heap itself
queueIndex = {}  # keys of the items currently in the heap

def queueInsert(item):
    if item.key in queueIndex:
        return  # item already in queue. Exit.
    heapq.heappush(theQueue, item)
    queueIndex[item.key] = 1

def queuePop():
    result = heapq.heappop(theQueue)
    del queueIndex[result.key]
    return result

Note that if the items you put on the heap are primitives like numbers or strings, you will need to replace item.key with item.
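For the heap in the question, whose entries are (priority, object) tuples, the key would be the second tuple element rather than item.key. Here is a minimal sketch of that variant. It assumes the objects are hashable, and the names push_unique, pop_min and members are made up for illustration. A set is used instead of a dictionary with dummy values; it serves the same purpose.

```python
import heapq

heap = []        # entries are (priority, obj) tuples, as in the question
members = set()  # hypothetical name: objects currently in the heap

def push_unique(priority, obj):
    """Push (priority, obj) unless obj is already in the heap."""
    if obj in members:  # average O(1), no scan of the heap
        return False
    heapq.heappush(heap, (priority, obj))
    members.add(obj)
    return True

def pop_min():
    """Pop the entry with the smallest priority and forget its object."""
    priority, obj = heapq.heappop(heap)
    members.remove(obj)
    return priority, obj

push_unique(10, "a")
push_unique(5, "b")
push_unique(7, "a")    # ignored: "a" is already queued
print(pop_min())       # (5, 'b')
print("b" in members)  # False
```

One caveat: tuples compare element by element, so if two entries can share a priority, the objects themselves must be orderable, or the push will raise a TypeError.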

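Also note that this will not work correctly if you can put duplicate items in the queue. You could modify it to allow that, though: maintain a count for each item so that its entry is not removed from the index until the count drops to 0.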
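The counting variant could look roughly like this. It is only a sketch: it assumes the items are hashable primitives (with richer items you would count item.key instead), and the names push, pop, contains and counts are invented for the example.

```python
import heapq
from collections import Counter

heap = []
counts = Counter()  # hypothetical name: item -> copies currently in the heap

def push(item):
    heapq.heappush(heap, item)
    counts[item] += 1

def pop():
    item = heapq.heappop(heap)
    counts[item] -= 1
    if counts[item] == 0:
        del counts[item]  # drop the entry only when the last copy leaves
    return item

def contains(item):
    return item in counts

push(3)
push(3)
print(pop())        # 3
print(contains(3))  # True: one copy is still in the heap
```

The explicit del matters because a Counter keeps keys whose count has reached zero, and "item in counts" would otherwise keep reporting True.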
