为什么我的实现不是 O(NlogN)?
Why isn't my implementation O(NlogN)?
我正在实施和测试 this SO question -
的答案
Given an array of integers find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b]
The answer with the most upvotes(目前)只提供算法的文本描述,应该是O(NlogN):
Sort the array... .
For each element x in the array:
Consider the array slice after the element.
Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0.
Output all elements (x, y) from y0 forwards as long as x + y <= b. ... If you only need to count the number of pairs, you can do it in O(nlogn). Modify the above algorithm so [b - x] (or the next smaller element) is also searched for.
我的实现:
import bisect
def ani(arr, a, b):
# Sort the array (say in increasing order).
arr.sort()
count = 0
for ndx, x in enumerate(arr):
# Consider the array slice after the element
after = arr[ndx+1:]
# Do a binary search on this array slice for [a - x], call it y0
lower = a - x
y0 = bisect.bisect_left(after, lower)
# If you only need to count the number of pairs
# Modify the ... algorithm so [b - x] ... is also searched for
upper = b - x
y1 = bisect.bisect_right(after, upper)
count += y1 - y0
return count
当我绘制时间与 N 或 N 的某些函数时,我看到了指数或 N^2 响应。
# generate timings
T = list() # run-times
N = range(100, 10001, 100) # N
arr = [random.randint(-10, 10) for _ in xrange(1000000)]
print 'start'
start = time.time()
for n in N:
arr1 = arr[:n]
t = Timer('ani(arr1, 5, 16)', 'from __main__ import arr1, ani')
timing_loops = 100
T.append(t.timeit(timing_loops) / timing_loops)
是我的实现不正确还是作者的说法不正确?
这里是一些数据图。
T 对 N
T / NlogN vs N - 一位评论者认为这不应该产生线性图 - 但它确实如此。
T vs NlogN - 如果复杂度为 NlogN,我认为这应该是线性的,但事实并非如此。
如果不出意外,这是你的错误:
for ndx, x in enumerate(arr):
# Consider the array slice after the element
after = arr[ndx+1:]
arr[ndx+1:]
创建长度列表的副本 len(arr) - ndx
,因此您的循环是 O(n^2)。
而是使用 lo
和 hi
参数 bisect.bisect
。
我正在实施和测试 this SO question -
的答案Given an array of integers find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b]
The answer with the most upvotes(目前)只提供算法的文本描述,应该是O(NlogN):
Sort the array... . For each element x in the array: Consider the array slice after the element. Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0. Output all elements (x, y) from y0 forwards as long as x + y <= b. ... If you only need to count the number of pairs, you can do it in O(nlogn). Modify the above algorithm so [b - x] (or the next smaller element) is also searched for.
我的实现:
import bisect
def ani(arr, a, b):
# Sort the array (say in increasing order).
arr.sort()
count = 0
for ndx, x in enumerate(arr):
# Consider the array slice after the element
after = arr[ndx+1:]
# Do a binary search on this array slice for [a - x], call it y0
lower = a - x
y0 = bisect.bisect_left(after, lower)
# If you only need to count the number of pairs
# Modify the ... algorithm so [b - x] ... is also searched for
upper = b - x
y1 = bisect.bisect_right(after, upper)
count += y1 - y0
return count
当我绘制时间与 N 或 N 的某些函数时,我看到了指数或 N^2 响应。
# generate timings
T = list() # run-times
N = range(100, 10001, 100) # N
arr = [random.randint(-10, 10) for _ in xrange(1000000)]
print 'start'
start = time.time()
for n in N:
arr1 = arr[:n]
t = Timer('ani(arr1, 5, 16)', 'from __main__ import arr1, ani')
timing_loops = 100
T.append(t.timeit(timing_loops) / timing_loops)
是我的实现不正确还是作者的说法不正确?
这里是一些数据图。
T 对 N
如果不出意外,这是你的错误:
for ndx, x in enumerate(arr):
# Consider the array slice after the element
after = arr[ndx+1:]
arr[ndx+1:]
创建长度列表的副本 len(arr) - ndx
,因此您的循环是 O(n^2)。
而是使用 lo
和 hi
参数 bisect.bisect
。