在最大堆中找到第二大元素的最快算法（有重复项）

Question

如果你有一个包含 n 个整数的最大堆，找到第二大元素的最有效方法是什么？堆可以包含重复项，因此具有 n-1 最大值和 1 其他值的堆将 return 其他值

例如，包含数字的堆：

4,4,4,4,4,4,4,3,4

将 return 值 3。

有没有比 O(n) 运行时更快的方法来做到这一点？

Answer 1

没有比 O(n) 更好的时间复杂度的方法了。使用您提供的示例数据 (4,4,4,4,4,4,4,3,4)，例如，堆可以是以下两者之一：

             4                      4
           /   \                  /   \
         4       4              4       4
        / \     / \            / \     / \
       4   4   4   4          4   4   3   4
      / \                    / \
     4   3                  4   4

... 3 可以在任何叶节点中，因为这取决于插入顺序。当你从根开始遍历时，没有办法知道3是在左边还是右边。

如果您愿意使用稍微替代的数据结构，那么可以在 O(1):

中完成

在堆中存储唯一值。使用哈希图存储有关您添加的值的信息。在简单的情况下，这个 "information" 可能是一个出现计数器。所以下次你想在结构中插入 same 值时，你会检测到它已经在 hashmap 中并且只会增加相应的出现计数器而不接触堆。

对于上面的例子，数据结构如下：

    heap              hashmap
                    key | value (=frequency)
       4           -----+-------------------
      /              4  |  8
     3               3  |  1

如果您的数据元素是将键与一些相关数据（属性）组合在一起的复杂结构，那么您仍然只能将键存储在堆中而不会重复。然后哈希图不会为每个键提供一个计数器，而是共享相同键的实际数据元素的数组。

所以要清楚，插入、删除和查找等操作的实现必须定制。下面是一些伪代码，假设存在两个具有相应行为的变量 heap 和 hashmap：

function insert(element):
    key = element.key
    if key not in hashmap:
        hashmap.set(key, new Array)
        heap.insert(key)
    arr = hashmap.get(key) # get a reference to the array
    arr.append(element) # add element to array, affecting the hashmap-stored value

function extract(): # remove max element
    if heap.size() == 0:
        return # No more data
    key = heap.peek() # look at root value
    arr = hashmap.get(key) # get a reference to the array
    element = arr.pop() # extract from array, affecting the hashmap-stored value
    if arr.length() == 0:
        heap.extract()
        hashmap.delete(key)
    return element

function peek(): # return element with max value
    if heap.size() == 0:
        return # No more data
    key = heap.peek() # look at root value
    arr = hashmap.get(key)
    element = arr[-1] # last element in array
    return element

可以得到小于最大值的最大值如下：

key = max(heap.root.children())

... 然后根据您对 return 值的期望，您还可以从哈希图中获取相应的数据元素，甚至是所有数据元素（当存在重复项时）。

在最大堆中找到第二大元素的最快算法（有重复项）

Fastest algorithm to find 2nd biggest element in Max Heap (with duplicates)

algorithm

heap

data-structures