理解二分搜索背后的直觉,寻找较小(最左边的 occ.)和较大(最右边的 occ.)元素(有重复)的计数

Understand the intuition behind Binary Search for finding counts of lesser (leftmost occ.) and greater (rightmost occ.) elements (has duplicates)

参考:https://en.wikipedia.org/wiki/Binary_search_algorithm

function binary_search_leftmost(A, n, T):
L := 0
R := n
while L < R:
    m := floor((L + R) / 2)
    if A[m] < T:
        L := m + 1
    else:
        R := m
return L

维基百科提供了这个函数来查找元素最左边的出现,这神奇地(至少对我来说 :P)表示元素的数量少于给定的目标(无论目标是否在数组中)

例如,考虑一个大小为 10 的数组 [1, 2, 4, 4, 4, 4, 4, 7, 9, 16]

wiki_leftmost(lst, 0) == 0 
0 is not in the array. 
The answer 0 represents the elements < 0 which is 0.

wiki_leftmost(lst, 1) == 0
1 is in the array. The answer 0 represent in the index of the leftmost occurence of 1.
The answer also represents the elements < 1 which is 1.

wiki_leftmost(lst, 4) == 2
4 is in the array. The answer 2 represent in the index of the leftmost occurence of 4.
The answer also represents the elements < 4 which is 2.

wiki_leftmost(lst, 7) == 7
7 is in the array. The answer 7 represent in the index of the leftmost occurence of 4.
The answer also represents the elements < 7 which is 7.

wiki_leftmost(lst, 10) == 9
10 is not in the array.
The answer 9 represents the elements < 10 which is 9.

wiki_leftmost(lst, 16) == 9
16 is in the array. The answer 9 represent in the index of the leftmost occurence of 16.
The answer also represents the elements < 16 which is 9.

wiki_leftmost(lst, 200) == 10
20 is not in the array.
The answer 10 represents the elements < 200 which is 10.

我理解正常二分查找背后的直觉(非常简单)及其替代版本(没有相等条件的那个)(两者都存在于 wiki 页面中)。但我不太能理解这个算法背后的直觉。我举了一些例子并完成了整个流程,但我需要一个适当的直觉来可视化和理解更多的算法。

PS: 有几个类似的问题,但是其中 none 被提到了所有这些细节,答案不是我能接受的。

function binary_search_leftmost(A, n, T): 如果 n 是数组中元素的数量 A 并且 T 是要搜索的元素, 当您分别分配 LR0n 时。 R 的值将超出索引,因为索引以 0 开始,它应该结束 1 小于元素总数 (n)

所以,函数应该是这样的:

function binary_search_leftmost(A, n, T):
L := 0
R := n - 1
while L < R:
    m := floor((L + R) / 2)
    if A[m] < T:
        L := m + 1
    else:
        R := m - 1
return L

目标: 找到最左边的 A[L] == T 并且 L 最小。

我们重塑我们的目标以检查是否相等,因为原始的二分搜索可能会在多个重复元素的中间某处给出一个元素,相反我们说如果我们能找到最后一个小于 T 的元素的索引,我们的答案可能如果它存在,则成为它之后的下一个元素。请参阅伪代码中的注释。

function binary_search_leftmost(A, n, T):
    L := 0
    R := n
    while L < R:                      
        m := floor((L + R) / 2)       # partition in 2 parts get middle
        if A[m] < T:                  # last element seen so far which is less than T
            L := m + 1                # this could possibly be A[L] == T, if not keep searching towards right
        else:
            R := m                    # A[m] >= T, mth element could be our answer
    return L                          # after this we still need to check if L < n and A[L] == T

类似地,对于最右边的匹配,我们重构以寻找大于 T 的最右边的元素,我们的目标元素可能就在最后一个最右边的元素之前。请参阅伪代码中的注释。

function binary_search_rightmost(A, n, T):
    L := 0
    R := n
    while L < R:
        m := floor((L + R) / 2) # partition in 2 parts get middle
        if A[m] > T:            # last element seen so far which is greater than T
            R := m              # our answer lies between [L, m], since we are interested in A[m] > T and it is possible A[m] is the only greater element than T
        else:
            L := m + 1          # since A[m] <= T and we are only interested in A[i] > T
    return R - 1 # return previous element, still need to check if R-1 >= 0 and A[R-1] == T