中位数之和(更快的解决方案)

Sum of medians (faster solution)

我们得到数字 N - 下一个列表的长度 (1 <= N <= 10^5)。

然后就是N个数的列表(1 <= num <= 10^9).

任务是在从 1 到 N 的每次迭代中找到中位数(在第 i 次迭代中,我们找到子数组 lst[:i] 的中位数),然后找到所有 N 个中位数的总和。


例子

输入:

10

5 10 8 1 7 3 9 6 2 4

输出:

59 (5+5+8+5+7+5+7+6+6+5)

输入2:

5

5 3 1 2 4

输出2:

16 (5+3+3+2+3)


- 这里提供了使用 BinarySearchTrees 的方法,我做到了。

但是这些限制不足以超过 2 秒的时间限制。有没有更快的解决方案?

class BinarySearchTree:
    def __init__(self, value):
        self.left = None
        self.right = None
        self.value = value

    def insert(self, value):
        if self.value:
            if value < self.value:
                if self.left is None:
                    self.left = BinarySearchTree(value)
                else:
                    self.left.insert(value)
            elif value > self.value:
                if self.right is None:
                    self.right = BinarySearchTree(value)
                else:
                    self.right.insert(value)
        else:
            self.value = value

    def output_subtree(self):
        if self.left:
            self.left.output_subtree()
        sub_tree.append(self.value)
        if self.right:
            self.right.output_subtree()


N = int(input())
vertices = list(map(int, input().split()))
medians = 0

tree = BinarySearchTree(vertices[0])
medians += vertices[0]

for i in range(1, N):
    sub_tree = []
    tree.insert(vertices[i])
    tree.output_subtree()
    if (i+1) % 2 == 0:
        medians += sub_tree[len(sub_tree)//2-1]
    else:
        medians += sub_tree[len(sub_tree)//2]

print(medians)

您可以使用双堆方法。

length = N/2

制作两个数组

第一个包含 min binary heap,第二个 - 最大二叉堆。最小堆将存储大值,最大堆 - 小值

在每次迭代中,将给定列表中的下一个元素添加到堆之一,保持大小相等(对于奇数计数器几乎相等)。

如果当前元素大于当前中位数:
如果最小堆大小等于最大堆大小,则移除最小堆的顶部,将该顶部插入最大堆
将当前元素添加到最小堆中。

如果当前元素小于当前中位数:
如果最大堆大小大于最小堆大小,将最大堆的顶部移动到最小堆
将当前元素插入最大堆

最大堆的每个stage top元素之后都是中值。

这个算法是O(NlogN),但是由于隐藏常量小,堆比搜索树运行得更快,并且不需要重新分配内存。

     min heap         max heap
5    -               (5)
10   10              (5)
8    10              (8) 5
1    8 10            (5) 1
7    8 10            (7) 5 1
3    7 8 10          (5) 3 1 
9    8 9 10          (7) 5 3 1 
6    7 8 9 10        (6) 5 3 1
...

感谢@MBo,我使用 MinHeapMaxHeap.

实现了这个问题的解决方案

MinHeap 中,顶部有最小值,任何 child 都大于其 parent。相反,MaxHeap 包含所有小元素,其中最大的元素位于根部。

这种结构让我们可以轻松地在每次迭代中更新中值。

class MaxHeap:
    def __init__(self):
        self.heapList = [0]
        self.currentSize = 0

    def percUp(self,i):
        while i // 2 > 0:
          if self.heapList[i] > self.heapList[i // 2]:
             tmp = self.heapList[i // 2]
             self.heapList[i // 2] = self.heapList[i]
             self.heapList[i] = tmp
          i = i // 2

    def insert(self,k):
      self.heapList.append(k)
      self.currentSize = self.currentSize + 1
      self.percUp(self.currentSize)

    def percDown(self,i):
      while (i * 2) <= self.currentSize:
          mc = self.maxChild(i)
          if self.heapList[i] < self.heapList[mc]:
              tmp = self.heapList[i]
              self.heapList[i] = self.heapList[mc]
              self.heapList[mc] = tmp
          i = mc

    def maxChild(self,i):
      if i * 2 + 1 > self.currentSize:
          return i * 2
      else:
          if self.heapList[i*2] > self.heapList[i*2+1]:
              return i * 2
          else:
              return i * 2 + 1

    def delMax(self):
      retval = self.heapList[1]
      self.heapList[1] = self.heapList[self.currentSize]
      self.currentSize = self.currentSize - 1
      self.heapList.pop()
      self.percDown(1)
      return retval

    def buildHeap(self,alist):
      i = len(alist) // 2
      self.currentSize = len(alist)
      self.heapList = [0] + alist[:]
      while (i > 0):
          self.percDown(i)
          i = i - 1


class MinHeap:
    def __init__(self):
        self.heapList = [0]
        self.currentSize = 0

    def percUp(self,i):
        while i // 2 > 0:
          if self.heapList[i] < self.heapList[i // 2]:
             tmp = self.heapList[i // 2]
             self.heapList[i // 2] = self.heapList[i]
             self.heapList[i] = tmp
          i = i // 2

    def insert(self,k):
      self.heapList.append(k)
      self.currentSize = self.currentSize + 1
      self.percUp(self.currentSize)

    def percDown(self,i):
      while (i * 2) <= self.currentSize:
          mc = self.minChild(i)
          if self.heapList[i] > self.heapList[mc]:
              tmp = self.heapList[i]
              self.heapList[i] = self.heapList[mc]
              self.heapList[mc] = tmp
          i = mc

    def minChild(self,i):
      if i * 2 + 1 > self.currentSize:
          return i * 2
      else:
          if self.heapList[i*2] < self.heapList[i*2+1]:
              return i * 2
          else:
              return i * 2 + 1

    def delMin(self):
      retval = self.heapList[1]
      self.heapList[1] = self.heapList[self.currentSize]
      self.currentSize = self.currentSize - 1
      self.heapList.pop()
      self.percDown(1)
      return retval

    def buildHeap(self,alist):
      i = len(alist) // 2
      self.currentSize = len(alist)
      self.heapList = [0] + alist[:]
      while (i > 0):
          self.percDown(i)
          i = i - 1

N = int(input())
lst = list(map(int, input().split()))
medians = 0

# minimal value's at the top; any child is bigger than its parent
min_heap = MinHeap()
# conversely
max_heap = MaxHeap()

# initial first values for each tree
if lst[0] > lst[1]:
    min_heap.insert(lst[0])
    max_heap.insert(lst[1])
    medians += lst[0]+lst[1]
else:
    min_heap.insert(lst[1])
    max_heap.insert(lst[0])
    medians += 2*lst[0]

# then the same procedure of the rest
for i in range(2, N):
    if lst[i] < max_heap.heapList[1]:
        max_heap.insert(lst[i])
    else:
        min_heap.insert(lst[i])
    # if the difference of size is bigger than one then balance
    # the trees moving root of the biggest tree in another one
    if min_heap.currentSize-max_heap.currentSize > 1:
        max_heap.insert(min_heap.delMin())
    elif max_heap.currentSize-min_heap.currentSize > 1:
        min_heap.insert(max_heap.delMax())
    # if the length is even we take len/2-th element; odd ==> (len+1)/2
    if max_heap.currentSize >= min_heap.currentSize:
        medians += max_heap.heapList[1]
    else:
        medians += min_heap.heapList[1]

print(medians)