如何提高np.random.choice()循环效率
How to improve np.random.choice() looping efficiency
我正在尝试将 np.random.choice
应用于具有不同权重的大数组,想知道有什么方法可以避免循环并提高性能?在这里 len(weights)
可能是数百万。
weights = [[0.1, 0.5, 0.4],
[0.2, 0.4, 0.4],
...
[0.3, 0.3, 0.4]]
choice = [1, 2, 3]
ret = np.zeros((len(weights), 20))
for i in range(len(weights)):
ret[i] = np.random.choice(choice, 20, p=weights[i])
看起来结果数组的每一行都独立于其他行。我不确定现在的表现有多糟糕。如果这真的是一个问题,我会尝试使用 python 的 multiprocessing
模块来 运行 随机数生成,同时进行多个进程。应该有帮助。
借用 alongwith 的想法,这是一种矢量化方式 -
def random_choice_prob_vectorized(weights, num_items, choice=None):
weights = np.asarray(weights)
w = weights.cumsum(1)
r = np.random.rand(len(weights),num_items)
m,n = w.shape
o = np.arange(m)[:,None]
w_o = (w+o).ravel()
r_o = (r+o).ravel()
idx = np.searchsorted(w_o,r_o).reshape(m,-1)%n
if choice is not None:
return np.asarray(choice)[idx]
else:
return idx
示例 运行 使用 -
验证
In [28]: weights = [[0.1, 0.5, 0.4],
...: [0.2, 0.4, 0.4],
...: [0.3, 0.3, 0.4]]
...:
...: choice = [1, 2, 3]
...: num_items = 20000
In [29]: out = random_choice_prob_vectorized(weights, num_items, choice)
# Use 2D bincount to get per average occurences and verify against weights
In [75]: bincount2D_vectorized(out)/num_items
Out[75]:
array([[0. , 0.09715, 0.4988 , 0.40405],
[0. , 0.1983 , 0.40235, 0.39935],
[0. , 0.30025, 0.29485, 0.4049 ]])
这是我在 中的回答的概括:
def vectorized_choice(p, n, items=None):
s = p.cumsum(axis=1)
r = np.random.rand(p.shape[0], n, 1)
q = np.expand_dims(s, 1) >= r
k = q.argmax(axis=-1)
if items is not None:
k = np.asarray(items)[k]
return k
p
应该是一个二维数组,其行是概率向量。 n
是从每行定义的分布中抽取的样本数。如果items
是None,样本是range(0, p.shape[1])
中的整数。如果items
不是None,应该是一个长度为p.shape[1]
.
的序列
示例:
In [258]: p = np.array([[0.1, 0.5, 0.4], [0.75, 0, 0.25], [0, 0, 1], [1/3, 1/3, 1/3]])
In [259]: p
Out[259]:
array([[0.1 , 0.5 , 0.4 ],
[0.75 , 0. , 0.25 ],
[0. , 0. , 1. ],
[0.33333333, 0.33333333, 0.33333333]])
In [260]: vectorized_choice(p, 20)
Out[260]:
array([[1, 1, 2, 1, 1, 2, 2, 2, 1, 2, 1, 1, 1, 2, 2, 0, 1, 2, 2, 2],
[0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[1, 0, 2, 2, 0, 1, 2, 1, 0, 0, 0, 0, 2, 2, 0, 0, 2, 1, 1, 2]])
In [261]: vectorized_choice(p, 20, items=[1, 2, 3])
Out[261]:
array([[2, 1, 2, 2, 2, 3, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, 2, 3, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 1, 1, 3, 3, 1, 3, 1, 1, 1],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[3, 3, 3, 1, 3, 2, 1, 2, 3, 1, 2, 2, 3, 2, 1, 2, 1, 2, 2, 2]])
p
的时间 (1000000, 3)
:
In [317]: p = np.random.rand(1000000, 3)
In [318]: p /= p.sum(axis=1, keepdims=True)
In [319]: %timeit vectorized_choice(p, 20, items=np.arange(1, p.shape[1]+1))
1.89 s ± 28.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Divakar 的功能时间如下:
In [320]: %timeit random_choice_prob_vectorized(p, 20, choice=np.arange(1, p.shape[1]+1))
7.33 s ± 43.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
如果增加 p
中的列数,差异会不太明显,如果列数足够大,Divakar 的功能会更快。例如
In [321]: p = np.random.rand(1000, 120)
In [322]: p /= p.sum(axis=1, keepdims=True)
In [323]: %timeit vectorized_choice(p, 20, items=np.arange(1, p.shape[1]+1))
6.41 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [324]: %timeit random_choice_prob_vectorized(p, 20, choice=np.arange(1, p.shape[1]+1))
6.29 ms ± 342 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
我正在尝试将 np.random.choice
应用于具有不同权重的大数组,想知道有什么方法可以避免循环并提高性能?在这里 len(weights)
可能是数百万。
weights = [[0.1, 0.5, 0.4],
[0.2, 0.4, 0.4],
...
[0.3, 0.3, 0.4]]
choice = [1, 2, 3]
ret = np.zeros((len(weights), 20))
for i in range(len(weights)):
ret[i] = np.random.choice(choice, 20, p=weights[i])
看起来结果数组的每一行都独立于其他行。我不确定现在的表现有多糟糕。如果这真的是一个问题,我会尝试使用 python 的 multiprocessing
模块来 运行 随机数生成,同时进行多个进程。应该有帮助。
借用
def random_choice_prob_vectorized(weights, num_items, choice=None):
weights = np.asarray(weights)
w = weights.cumsum(1)
r = np.random.rand(len(weights),num_items)
m,n = w.shape
o = np.arange(m)[:,None]
w_o = (w+o).ravel()
r_o = (r+o).ravel()
idx = np.searchsorted(w_o,r_o).reshape(m,-1)%n
if choice is not None:
return np.asarray(choice)[idx]
else:
return idx
示例 运行 使用
In [28]: weights = [[0.1, 0.5, 0.4],
...: [0.2, 0.4, 0.4],
...: [0.3, 0.3, 0.4]]
...:
...: choice = [1, 2, 3]
...: num_items = 20000
In [29]: out = random_choice_prob_vectorized(weights, num_items, choice)
# Use 2D bincount to get per average occurences and verify against weights
In [75]: bincount2D_vectorized(out)/num_items
Out[75]:
array([[0. , 0.09715, 0.4988 , 0.40405],
[0. , 0.1983 , 0.40235, 0.39935],
[0. , 0.30025, 0.29485, 0.4049 ]])
这是我在
def vectorized_choice(p, n, items=None):
s = p.cumsum(axis=1)
r = np.random.rand(p.shape[0], n, 1)
q = np.expand_dims(s, 1) >= r
k = q.argmax(axis=-1)
if items is not None:
k = np.asarray(items)[k]
return k
p
应该是一个二维数组,其行是概率向量。 n
是从每行定义的分布中抽取的样本数。如果items
是None,样本是range(0, p.shape[1])
中的整数。如果items
不是None,应该是一个长度为p.shape[1]
.
示例:
In [258]: p = np.array([[0.1, 0.5, 0.4], [0.75, 0, 0.25], [0, 0, 1], [1/3, 1/3, 1/3]])
In [259]: p
Out[259]:
array([[0.1 , 0.5 , 0.4 ],
[0.75 , 0. , 0.25 ],
[0. , 0. , 1. ],
[0.33333333, 0.33333333, 0.33333333]])
In [260]: vectorized_choice(p, 20)
Out[260]:
array([[1, 1, 2, 1, 1, 2, 2, 2, 1, 2, 1, 1, 1, 2, 2, 0, 1, 2, 2, 2],
[0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[1, 0, 2, 2, 0, 1, 2, 1, 0, 0, 0, 0, 2, 2, 0, 0, 2, 1, 1, 2]])
In [261]: vectorized_choice(p, 20, items=[1, 2, 3])
Out[261]:
array([[2, 1, 2, 2, 2, 3, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, 2, 3, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 1, 1, 3, 3, 1, 3, 1, 1, 1],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[3, 3, 3, 1, 3, 2, 1, 2, 3, 1, 2, 2, 3, 2, 1, 2, 1, 2, 2, 2]])
p
的时间 (1000000, 3)
:
In [317]: p = np.random.rand(1000000, 3)
In [318]: p /= p.sum(axis=1, keepdims=True)
In [319]: %timeit vectorized_choice(p, 20, items=np.arange(1, p.shape[1]+1))
1.89 s ± 28.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Divakar 的功能时间如下:
In [320]: %timeit random_choice_prob_vectorized(p, 20, choice=np.arange(1, p.shape[1]+1))
7.33 s ± 43.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
如果增加 p
中的列数,差异会不太明显,如果列数足够大,Divakar 的功能会更快。例如
In [321]: p = np.random.rand(1000, 120)
In [322]: p /= p.sum(axis=1, keepdims=True)
In [323]: %timeit vectorized_choice(p, 20, items=np.arange(1, p.shape[1]+1))
6.41 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [324]: %timeit random_choice_prob_vectorized(p, 20, choice=np.arange(1, p.shape[1]+1))
6.29 ms ± 342 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)