多线程生日悖论
Multi-thread the Birthday Paradox
我一直在玩 Birthday Problem,并想出了下面的代码。问题是,即使是 100 万次的小运行,它也会占用相当长的时间。由于 tick()
是最昂贵的函数,并且被很好地隔离(我认为),我将如何对 while
循环部分进行多线程处理?
来自 c++ openmp background, I would probably be using #pragma omp parallel
, but I don't know how to do that in python,尤其是 GIL
问题
还有其他方法可以提高性能吗?
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 10.320 0.000 21.736 0.000 sim.py:29(tick)
24617823 6.931 0.000 9.209 0.000 sim.py:8(getday)
24617823 2.278 0.000 2.278 0.000 {method 'randint' of 'mtrand.RandomState' objects}
23617823 2.115 0.000 2.115 0.000 {method 'add' of 'set' objects}
1 0.983 0.983 22.967 22.967 sim.py:12(main)
999999 0.105 0.000 0.105 0.000 {method 'append' of 'list' objects}
import numpy as np
def getday():
return np.random.randint(1, 366)
def main():
# Multithread this section
popSize = [tick()]
while len(popSize) < 1000000:
popSize.append(tick())
print "Iterations %d" % len(popSize)
print "Average pop size %d" % (sum(popSize) / len(popSize))
print "Min pop size %d" % min(popSize)
print "Max pop size %d" % max(popSize)
def tick():
days = set()
day = getday()
while day not in days:
days.add(day)
day = getday()
return len(days)
if __name__ == '__main__':
main()
尝试此代码,其中 main() 下的 if
子句选择原始版本或 mp 版本...
import numpy as np
import multiprocessing as mp
def getday():
return np.random.randint(1, 366)
def main():
size = 1000000
if 0: # original version
popSize = [tick()]
while len(popSize) < size:
popSize.append(tick())
else: # multithreaded
pool = mp.Pool()
popSize = pool.map(tick, range(size))
print "Iterations %d" % len(popSize)
print "Average pop size %d" % (sum(popSize) / len(popSize))
print "Min pop size %d" % min(popSize)
print "Max pop size %d" % max(popSize)
def tick(dummy=None):
days = set()
day = getday()
while day not in days:
days.add(day)
day = getday()
return len(days)
if __name__ == '__main__':
main()
在我的 14 核机器上 运行-时间从 21 秒变为 1.75 秒,这相当接近理论最大值。
多处理中有几种不同的并行方法可供您尝试。请参阅 https://docs.python.org/2/library/multiprocessing.html 16.6.2.9。进程池
请注意,可选的 chunksize 参数会极大地影响性能。使用 map
似乎不需要加速,但如果使用 imap
,则需要将 chunksize
设置为 100 之类的值才能看到加速。
我一直在玩 Birthday Problem,并想出了下面的代码。问题是,即使是 100 万次的小运行,它也会占用相当长的时间。由于 tick()
是最昂贵的函数,并且被很好地隔离(我认为),我将如何对 while
循环部分进行多线程处理?
来自 c++ openmp background, I would probably be using #pragma omp parallel
, but I don't know how to do that in python,尤其是 GIL
问题
还有其他方法可以提高性能吗?
ncalls tottime percall cumtime percall filename:lineno(function)
1000000 10.320 0.000 21.736 0.000 sim.py:29(tick)
24617823 6.931 0.000 9.209 0.000 sim.py:8(getday)
24617823 2.278 0.000 2.278 0.000 {method 'randint' of 'mtrand.RandomState' objects}
23617823 2.115 0.000 2.115 0.000 {method 'add' of 'set' objects}
1 0.983 0.983 22.967 22.967 sim.py:12(main)
999999 0.105 0.000 0.105 0.000 {method 'append' of 'list' objects}
import numpy as np
def getday():
return np.random.randint(1, 366)
def main():
# Multithread this section
popSize = [tick()]
while len(popSize) < 1000000:
popSize.append(tick())
print "Iterations %d" % len(popSize)
print "Average pop size %d" % (sum(popSize) / len(popSize))
print "Min pop size %d" % min(popSize)
print "Max pop size %d" % max(popSize)
def tick():
days = set()
day = getday()
while day not in days:
days.add(day)
day = getday()
return len(days)
if __name__ == '__main__':
main()
尝试此代码,其中 main() 下的 if
子句选择原始版本或 mp 版本...
import numpy as np
import multiprocessing as mp
def getday():
return np.random.randint(1, 366)
def main():
size = 1000000
if 0: # original version
popSize = [tick()]
while len(popSize) < size:
popSize.append(tick())
else: # multithreaded
pool = mp.Pool()
popSize = pool.map(tick, range(size))
print "Iterations %d" % len(popSize)
print "Average pop size %d" % (sum(popSize) / len(popSize))
print "Min pop size %d" % min(popSize)
print "Max pop size %d" % max(popSize)
def tick(dummy=None):
days = set()
day = getday()
while day not in days:
days.add(day)
day = getday()
return len(days)
if __name__ == '__main__':
main()
在我的 14 核机器上 运行-时间从 21 秒变为 1.75 秒,这相当接近理论最大值。
多处理中有几种不同的并行方法可供您尝试。请参阅 https://docs.python.org/2/library/multiprocessing.html 16.6.2.9。进程池
请注意,可选的 chunksize 参数会极大地影响性能。使用 map
似乎不需要加速,但如果使用 imap
,则需要将 chunksize
设置为 100 之类的值才能看到加速。