NumPy random number generator latency

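Why is generating random numbers with NumPy so much slower across many small calls than in a single call that produces the same number of samples?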

Example:

import numpy as np
import timeit

if __name__ == '__main__':
    # timeit runs each statement 1,000,000 times by default and returns the total in seconds.
    # Legacy global API: one call for 100 samples vs. 100 calls for 1 sample each.
    latency_normal = timeit.timeit('np.random.uniform(size=(100,))', setup='import numpy as np')
    latency_normal_loop = timeit.timeit('[np.random.uniform(size=1) for _ in range(100)]', setup='import numpy as np')

    rng = np.random.default_rng()

    # The timed statement runs in its own namespace, so rng must be passed in via globals.
    latency_generator = timeit.timeit('rng.uniform(size=(100,))', globals={'rng': rng})
    latency_generator_loop = timeit.timeit('[rng.uniform(size=1) for _ in range(100)]', globals={'rng': rng})

    print("latencies:\t normal: [{}, {}]\t generator: [{},{}]".format(latency_normal, latency_normal_loop, latency_generator, latency_generator_loop))

Output (total seconds for timeit's default of 1,000,000 executions of each statement):

latencies:       normal: [2.7388298519999807, 31.694285575999857]        generator: [2.6634575979996953,31.0009219450003]

Is there an alternative that performs better for repeated calls with a small sample size?

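Clearly, each function call carries a large fixed cost: in your timings, 100 calls of size 1 take more than ten times as long as one call of size 100. To work around this, you can write a wrapper that fetches a batch of random numbers from NumPy in a single call (say, 100) and then returns values from that cache one at a time. When the cache is exhausted, it asks NumPy for another 100 numbers, and so on.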
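The batching idea above could be sketched roughly as follows. The class name `BufferedUniform`, its `next` method, and the `batch_size` and `seed` parameters are hypothetical names chosen for illustration, not part of NumPy. Whether this actually beats direct calls depends on your batch size and machine, so it is worth timing on your own workload.

```python
import numpy as np


class BufferedUniform:
    """Hypothetical helper: hands out uniform samples one at a time from a pre-drawn batch."""

    def __init__(self, batch_size=100, seed=None):
        self._rng = np.random.default_rng(seed)
        self._batch_size = batch_size
        self._refill()

    def _refill(self):
        # One NumPy call pays the fixed per-call cost once for the whole batch.
        # tolist() converts to plain Python floats so later indexing stays cheap.
        self._buffer = self._rng.uniform(size=self._batch_size).tolist()
        self._index = 0

    def next(self):
        # Draw a fresh batch only when the current one is used up.
        if self._index >= self._batch_size:
            self._refill()
        value = self._buffer[self._index]
        self._index += 1
        return value
```

Usage would be along the lines of `source = BufferedUniform(batch_size=100)` followed by repeated `source.next()` calls wherever a single sample is needed. A larger batch amortizes the call overhead further at the cost of a little memory.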

Alternatively, you can simply use Python's built-in random module!
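A minimal sketch of that alternative, using only the standard library. The sample count and the range 2.0 to 5.0 are arbitrary values picked for illustration. Single draws avoid NumPy's array construction, which is usually where the per-call cost goes, but the actual gain should be measured rather than assumed.

```python
import random

# One scalar per call, uniform on [0.0, 1.0); no array is allocated.
samples = [random.random() for _ in range(100)]

# Uniform between two bounds (illustrative values).
scaled = [random.uniform(2.0, 5.0) for _ in range(100)]

# A dedicated, seeded generator keeps results reproducible
# without touching the module-level global state.
local_rng = random.Random(12345)
reproducible = [local_rng.random() for _ in range(100)]
```

Note that the random module uses a different generator (Mersenne Twister) than `np.random.default_rng()`, so the two will not produce the same sequences even with the same seed.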