NumPy random number generator latency
Why is generating random numbers with numpy much slower when done in many repeated calls than in a single vectorized call?
Example:
import numpy as np
import timeit

if __name__ == '__main__':
    latency_normal = timeit.timeit('np.random.uniform(size=(100,))', setup='import numpy as np')
    latency_normal_loop = timeit.timeit('[np.random.uniform(size=1) for _ in range(100)]', setup='import numpy as np')
    rng = np.random.default_rng()
    # Pass globals() so that timeit can see the rng object defined above.
    latency_generator = timeit.timeit('rng.uniform(size=(100,))', globals=globals())
    latency_generator_loop = timeit.timeit('[rng.uniform(size=1) for _ in range(100)]', globals=globals())
    print("latencies:\t normal: [{}, {}]\t generator: [{}, {}]".format(
        latency_normal, latency_normal_loop, latency_generator, latency_generator_loop))
Output:
latencies: normal: [2.7388298519999807, 31.694285575999857] generator: [2.6634575979996953,31.0009219450003]
Is there any alternative that performs better for repeated calls with small sample sizes?
Apparently, each function call carries a large fixed per-call cost. To work around this, you can write a wrapper that retrieves a batch of random numbers from numpy in a single call (e.g. 100 at a time) and then returns values from that cache one by one. When the cache is exhausted, it requests another 100 numbers from numpy, and so on.
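A minimal sketch of such a caching wrapper, assuming uniform samples in [0, 1) are what is needed (the class name and batch size are illustrative, not part of any NumPy API):

```python
import numpy as np

class BatchedUniform:
    """Amortize NumPy's per-call overhead by drawing random numbers
    in batches and serving them one at a time from a cache."""

    def __init__(self, batch_size=100, seed=None):
        self._rng = np.random.default_rng(seed)
        self._batch_size = batch_size
        self._cache = self._rng.uniform(size=batch_size)
        self._index = 0

    def next(self):
        # Refill the cache once it is exhausted.
        if self._index >= self._batch_size:
            self._cache = self._rng.uniform(size=self._batch_size)
            self._index = 0
        value = self._cache[self._index]
        self._index += 1
        return value
```

Each `next()` call is then mostly a Python-level array index; the expensive NumPy call happens only once per `batch_size` draws.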
Alternatively, you can simply use Python's built-in random module, which has much lower per-call overhead for scalar draws.
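A quick comparison of scalar draws from both libraries, using the same timeit approach as the question (the iteration count is arbitrary; exact numbers will vary by machine):

```python
import timeit

# Compare per-call cost of a single scalar draw from each library.
t_numpy = timeit.timeit('np.random.uniform()',
                        setup='import numpy as np', number=100_000)
t_python = timeit.timeit('random.uniform(0, 1)',
                         setup='import random', number=100_000)
print(f"numpy scalar: {t_numpy:.3f}s\tpython random: {t_python:.3f}s")
```

For one-at-a-time scalar samples, `random.uniform` typically comes out ahead because it does not pay NumPy's array-creation and dispatch overhead on every call.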