How can I choose random indices based on probability?

Question

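I have a list of numbers, and I'm trying to write a function that chooses n random indices i such that the likelihood of choosing i is percentages[i].

The function: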

def choose_randomly(probabilities, n):
    percentages = accumulated_s(probabilities)
    result = []
    for i in range(n):
        r = random()
        for j in range(n):
            if r < percentages[j]:
                result = result + [j]
    return result

accumulated_s just generates the corresponding list of accumulated probabilities.

I'm expecting results like this:

choose_randomly([1, 2, 3, 4], 2) -> [3 3 0]
choose_randomly([1, 2, 3, 4], 2) -> [1 3 1]

The problem is that this does not return n indices. Can anyone point out what I'm doing wrong? Thank you very much!

Answer 1

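Once you've found the right probability range, you're done; break out of the inner loop to generate the next value. Otherwise you behave as if every threshold above the correct one matched too, so a single draw can append several indices. The inner loop should also iterate over all of percentages rather than range(n), since n is the number of samples, not the number of probabilities: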

    # Enumerate all percentages, not just first n
    for j, pct in enumerate(percentages):
        if r < pct:
            result.append(j)  # Don't create tons of temporary lists; mutate in place
            break  # <-- Don't add more results
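
For context, here is a minimal sketch of the whole function with that fix applied. The question does not show accumulated_s, so the version below is only an assumption: it builds running totals of the weights and normalizes them so the last threshold is exactly 1.0.

```python
from itertools import accumulate
from random import random


def accumulated_s(weights):
    # Assumed helper (not shown in the question): running totals of the
    # weights, divided by the final total so the last threshold is 1.0.
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    return [c / total for c in cumulative]


def choose_randomly(probabilities, n):
    percentages = accumulated_s(probabilities)
    result = []
    for _ in range(n):
        r = random()  # uniform float in [0.0, 1.0)
        # Scan every threshold, not just the first n
        for j, pct in enumerate(percentages):
            if r < pct:
                result.append(j)
                break  # first threshold above r wins; stop scanning
    return result
```

Because random() is always below 1.0 and the last threshold is 1.0, each pass of the outer loop appends exactly one index, so the result has n entries.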

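Also note that if the set of probabilities has many values, it may make more sense to use the functions from the bisect module to find the correct entry instead of scanning linearly each time. For a small number of entries in percentages a linear scan is fine, but for a large number an O(log n) lookup may beat an O(n) scan.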
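
A rough sketch of the bisect variant, under the same assumption that the thresholds are cumulative weights normalized to end at 1.0. The name choose_randomly_bisect is made up for illustration.

```python
from bisect import bisect_right
from itertools import accumulate
from random import random


def choose_randomly_bisect(weights, n):
    # Hypothetical name. Thresholds are assumed to be cumulative weights
    # normalized so that the last one is exactly 1.0.
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    percentages = [c / total for c in cumulative]
    # bisect_right returns the first index j where r < percentages[j],
    # which is the same index the linear scan with break would pick.
    return [bisect_right(percentages, random()) for _ in range(n)]
```

The thresholds are sorted by construction (assuming non-negative weights), which is what makes the binary search valid.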

python

probability