Parallelizing through multi-threading and multi-processing takes significantly more time than serial
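I am trying to learn how to do parallel programming in Python. I wrote a simple int square function and then ran it serially, with multiple threads, and with multiple processes: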
import time
import multiprocessing, threading
import random

def calc_square(numbers):
    sq = 0
    for n in numbers:
        sq = n*n

def splita(list, n):
    a = [[] for i in range(n)]
    counter = 0
    for i in range(0,len(list)):
        a[counter].append(list[i])
        if len(a[counter]) == len(list)/n:
            counter = counter +1
            continue
    return a

if __name__ == "__main__":
    random.seed(1)
    arr = [random.randint(1, 11) for i in xrange(1000000)]
    print "init completed"

    start_time2 = time.time()
    calc_square(arr)
    end_time2 = time.time()
    print "serial: " + str(end_time2 - start_time2)

    newarr = splita(arr,8)
    print 'split complete'

    start_time = time.time()
    for i in range(8):
        t1 = threading.Thread(target=calc_square, args=(newarr[i],))
        t1.start()
        t1.join()
    end_time = time.time()
    print "mt: " + str(end_time - start_time)

    start_time = time.time()
    for i in range(8):
        p1 = multiprocessing.Process(target=calc_square, args=(newarr[i],))
        p1.start()
        p1.join()
    end_time = time.time()
    print "mp: " + str(end_time - start_time)
Output:
init completed
serial: 0.0640001296997
split complete
mt: 0.0599999427795
mp: 2.97099995613
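However, as you can see, something strange happened: mt takes about the same time as serial, and mp actually takes significantly longer (almost 50 times longer).
What am I doing wrong? Could someone point me in the right direction for learning parallel programming in Python?
Edit 01
After reading the comments, I see that a function that returns nothing may seem pointless. The reason I tried this at all is that I had previously tried the following add function: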
def addi(numbers):
    sq = 0
    for n in numbers:
        sq = sq + n
    return sq
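I tried returning the sum of each part to a serial adder, so that I could at least see some performance improvement over a purely serial implementation. However, I could not figure out how to store and use the returned values, which is why I tried something even simpler: just splitting the array and running a simple function on each part.
Thanks!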
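I think that multiprocessing takes quite a long time to create and start each process. I have changed the program to make arr 10 times the size and changed the way the processes are started, and I get a slight speed-up:
(Also note that this is Python 3.)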
import time
import multiprocessing, threading
from multiprocessing import Queue
import random

def calc_square_q(numbers,q):
    while q.empty():
        pass
    return calc_square(numbers)

if __name__ == "__main__":
    random.seed(1) # note how big arr is now vvvvvvv
    arr = [random.randint(1, 11) for i in range(10000000)]
    print("init completed")

    # ...
    # other stuff as before
    # ...

    processes=[]
    q=Queue()
    for arrs in newarr:
        processes.append(multiprocessing.Process(target=calc_square_q, args=(arrs,q)))
    print('start processes')
    for p in processes:
        p.start() # even tho' each process is started it waits...
    print('join processes')
    q.put(None) # ... for q to become not empty.
    start_time = time.time()
    for p in processes:
        p.join()
    end_time = time.time()
    print("mp: " + str(end_time - start_time))
Also note above how I create and start the processes in two separate loops, and then finally join them in a third loop.
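To make the start-then-join ordering explicit, here is a minimal Python 3 sketch of that pattern on its own. The helper name start_all_then_join is hypothetical (it is not part of the code above), and it assumes worker_cls is either threading.Thread or multiprocessing.Process, both of which accept target= and args= and provide start() and join(). The demo uses threads only to keep the example small.

```python
import threading


def calc_square(numbers):
    # Same loop as in the question: the result is computed and discarded.
    sq = 0
    for n in numbers:
        sq = n * n


def start_all_then_join(worker_cls, target, chunks):
    """Create one worker per chunk, start them all, then join them all.

    Hypothetical helper: worker_cls is assumed to be threading.Thread
    or multiprocessing.Process.
    """
    workers = [worker_cls(target=target, args=(chunk,)) for chunk in chunks]
    for w in workers:
        w.start()  # every worker is started before we wait on any of them
    for w in workers:
        w.join()   # only now block until each worker has finished
    return len(workers)


if __name__ == "__main__":
    chunks = [list(range(1000)) for _ in range(8)]
    print("workers run:", start_all_then_join(threading.Thread, calc_square, chunks))
```

Calling join() right after start() inside the same loop, as in the question, makes each worker finish before the next one is even created, so the work runs one chunk at a time.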
Output:
init completed
serial: 0.53214430809021
split complete
start threads
mt: 0.5551605224609375
start processes
join processes
mp: 0.2800724506378174
With another factor of 10 increase in the size of arr:
init completed
serial: 5.8455305099487305
split complete
start threads
mt: 5.411392450332642
start processes
join processes
mp: 1.9705185890197754
And yes, I have also tried this in Python 2.7, although threads seemed to be slower there.
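On the part of the question about storing and using return values: one possible approach, sketched here under my own assumptions rather than taken from the timings above, is to let an executor from the standard concurrent.futures module collect each chunk's result. addi is the function from the question's edit; split_chunks and parallel_sum are hypothetical helper names, and the executor_cls parameter exists only so that a different executor class can be swapped in.

```python
from concurrent.futures import ProcessPoolExecutor


def addi(numbers):
    # The summing function from the question's edit.
    sq = 0
    for n in numbers:
        sq = sq + n
    return sq


def split_chunks(items, n):
    # Hypothetical helper: split items into n contiguous, nearly equal chunks.
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum(numbers, n_chunks, executor_cls=ProcessPoolExecutor):
    # executor.map yields each chunk's return value, in chunk order.
    with executor_cls(max_workers=n_chunks) as executor:
        partial_sums = list(executor.map(addi, split_chunks(numbers, n_chunks)))
    # The "serial adder": combine the per-chunk results in the parent.
    return sum(partial_sums)
```

With the default process-based executor, call parallel_sum(arr, 8) from under an if __name__ == "__main__": guard, as in the programs above. Each chunk still has to be sent to its worker process, so for an operation as cheap as addition this is not guaranteed to beat the serial loop.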