Why is the Pool class in multiprocessing not giving an advantage over linear processing?
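I am trying to measure the advantage of the Pool class in the multiprocessing module over normal programming, and I am calculating the square of a number using a function. When I measure the time taken to find the squares of all three numbers, it takes about ~0.24 sec, but when I calculate them normally in a for loop it takes even less, ~0.007 sec. Why is that? Shouldn't the code using Pool be faster?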

import time
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    start = time.time()
    array = []
    for i in range(1000000):
        array.append(i)
    with Pool(4) as p:
        (p.map(f, array))
    print(time.time()-start)  # time taken when using pool

    start1 = time.time()
    for i in range(1000000):
        f(array[i])
    print(time.time()-start1)  # time taken normally

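So, as suggested by @klaus D. and @wwii, I did not have enough computation to overcome the overhead of spawning processes and the time spent switching between processes.
Below is the updated code that makes the difference visible. Hope it helps.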

import time
from multiprocessing import Pool

def f(x):
    time.sleep(3)  # stand-in for a task that takes a long time

if __name__ == '__main__':
    array = []
    for i in range(4):
        array.append(i)

    start = time.time()
    with Pool(4) as p:
        (p.map(f, array))
    print(time.time()-start)  # time taken when using pool

    start1 = time.time()
    for i in range(4):
        f(array[i])
    print(time.time()-start1)  # time taken normally

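To make the overhead argument concrete, here is a rough back-of-the-envelope model, not a measurement. It assumes the pool costs one fixed overhead (process startup plus data transfer) and that tasks then run in "rounds" of `workers` at a time. The function names and the 0.2 s overhead figure are made up for illustration; the real overhead depends on the machine and on how much data is pickled.

```python
import math


def estimate_serial(n_tasks, task_seconds):
    """Plain loop: every task runs one after another."""
    return n_tasks * task_seconds


def estimate_pool(n_tasks, task_seconds, workers, overhead_seconds):
    """Rough pool time: a fixed overhead plus one task duration per round."""
    rounds = math.ceil(n_tasks / workers)  # batches of `workers` tasks
    return overhead_seconds + rounds * task_seconds


if __name__ == '__main__':
    overhead = 0.2  # assumed value, only for illustration

    # Four 3-second tasks on four workers: the pool should win clearly.
    print(estimate_serial(4, 3.0), estimate_pool(4, 3.0, 4, overhead))

    # Three tasks of about a microsecond each: the overhead dominates.
    print(estimate_serial(3, 1e-6), estimate_pool(3, 1e-6, 4, overhead))
```

Under this model the pool only pays off once the work per round is large compared with the overhead, which matches what the sleep example shows.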
The problem is that the function your pool workers run is too simple to benefit from parallelism.
Try this:

import time
from multiprocessing import Pool

N = 80
M = 1_000_000

def f_std(array):
    """
    Calculate Standard deviation
    """
    mean = sum(array)/len(array)
    std = ((sum(map(lambda x: (x-mean)**2, array)))/len(array))**.5
    return std

if __name__ == '__main__':
    array = []
    for i in range(N):
        array.append(range(M))

    start = time.time()
    with Pool(8) as p:
        (p.map(f_std, array))
    print(time.time()-start)  # time taken when using pool

    start1 = time.time()
    for i in range(N):
        f_std(array[i])
    print(time.time()-start1)  # time taken normally

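The same idea can be applied to the original squaring problem: instead of sending a million tiny tasks, hand each worker one big slice of the range. This is only a sketch; `square_chunk` and `make_chunks` are names invented here, and `make_chunks` assumes `n >= 1` and `parts >= 1`. Squaring is so cheap that shipping the results back may still dominate, so this is not guaranteed to beat the plain loop on a given machine.

```python
from multiprocessing import Pool


def square_chunk(bounds):
    """Square every integer in [start, stop) inside a single worker call."""
    start, stop = bounds
    return [x * x for x in range(start, stop)]


def make_chunks(n, parts):
    """Split range(n) into at most `parts` contiguous (start, stop) pairs."""
    step = -(-n // parts)  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


if __name__ == '__main__':
    chunks = make_chunks(1_000_000, 4)
    with Pool(4) as p:
        pieces = p.map(square_chunk, chunks)  # four large tasks, not a million small ones
    squares = [value for piece in pieces for value in piece]
    print(len(squares))
```

`Pool.map` also accepts a `chunksize` argument that batches items per task in a similar way, without changing the worker function.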