How to choose a sample according to a distribution?
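I have an array of elements [a_1, a_2, ..., a_n] and an array of probabilities [p_1, p_2, ..., p_n] associated with those elements.
I want to choose k elements from [a_1, ..., a_n], with k << n, according to the probabilities [p_1, p_2, ..., p_n].
How can I code this in Python? Thank you very much; I have no programming experience.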
Maybe you want something like this?
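import random
data = ['a', 'b', 'c', 'd']
probabilities = [0.5, 0.1, 0.9, 0.2]
# Each element is kept independently with its own probability,
# so the number of elements printed varies from line to line.
for _ in range(10):
    print([d for d, p in zip(data, probabilities) if p > random.random()])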
The above will output something like the following:
['c']
['c']
['a', 'c']
['a', 'c']
['a', 'c']
[]
['a', 'c']
['c', 'd']
['a', 'c']
['d']
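Note that the snippet above decides on each element independently, so the number of items returned changes from run to run (it can even be empty). If exactly k draws are wanted, one possible sketch uses random.choices from the standard library, which treats the numbers as relative weights and samples with replacement. The value k = 2 and the seed are assumptions made only for illustration.

```python
import random

data = ['a', 'b', 'c', 'd']
weights = [0.5, 0.1, 0.9, 0.2]  # relative weights; they need not sum to 1

k = 2  # assumed number of draws, chosen only for this example

rng = random.Random(0)  # seeded only so that repeated runs give the same draw
# choices() samples WITH replacement, so the same element may appear twice
sample = rng.choices(data, weights=weights, k=k)
print(sample)
```

Because the draws are with replacement, this fits best when repeats are acceptable or when k is much smaller than n.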
Use numpy.random.choice.
Example:
import numpy as np
from numpy.random import choice
sample_space = np.array([a_1, a_2, ... a_n])  # substitute the a_i's
discrete_probability_distribution = np.array([p_1, p_2, ..., p_n])  # substitute the p_i's; they must sum to 1
# picking N samples, one per call; the probabilities must be passed as the p= keyword
N = 10
for _ in range(N):
    print(choice(sample_space, p=discrete_probability_distribution))
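To draw k elements in a single call rather than one at a time, choice also accepts a size argument, and replace=False keeps any element from being picked twice. The following is a minimal sketch, not a definitive answer: the four letters, the weights, k = 2, and the seed are stand-in values assumed for illustration, and the weights are normalized first because choice requires p to sum to 1.

```python
import numpy as np

# Stand-in values for the a_i's and p_i's (assumptions for this example)
sample_space = np.array(['a', 'b', 'c', 'd'])
weights = np.array([0.5, 0.1, 0.9, 0.2])

# choice() requires the probabilities to sum to 1, so normalize the weights
probabilities = weights / weights.sum()

k = 2  # assumed number of elements to pick

rng = np.random.default_rng(0)  # seeded only so that repeated runs match
# replace=False draws k distinct elements; it needs at least k nonzero probabilities
sample = rng.choice(sample_space, size=k, replace=False, p=probabilities)
print(sample)
```

Dropping replace=False (the default is True) allows the same element to be drawn more than once.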