Comparison between threading module and multiprocessing module
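I am trying to compare whether threading or multiprocessing is faster. In theory, because of the GIL, multiprocessing should be faster than multithreading, since only one thread runs at a time. But I am getting the opposite result: the threading version takes less time than the multiprocessing version. What am I missing? Please help.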
Below is the threading code:
import threading
from queue import Queue
import time

print_lock = threading.Lock()

def exampleJob(worker):
    time.sleep(10)  # simulated work; sleeping does not use the CPU
    with print_lock:
        print(threading.current_thread().name, worker)

def threader():
    # each thread keeps pulling jobs from the shared queue
    while True:
        worker = q.get()
        exampleJob(worker)
        q.task_done()

q = Queue()
for x in range(4):
    t = threading.Thread(target=threader)
    print(x)
    t.daemon = True  # daemon threads exit when the main thread exits
    t.start()

start = time.time()
for worker in range(8):
    q.put(worker)
q.join()  # block until every queued job has been marked done
print('Entire job took:', time.time() - start)
Below is the multiprocessing code:
import multiprocessing as mp
import time

def exampleJob(print_lock, worker):  # function simulating some computation
    time.sleep(10)
    with print_lock:
        print(mp.current_process().name, worker)

def processor(print_lock, q):  # function where a process picks up the job
    while True:
        worker = q.get()
        if worker is None:  # flag to exit the process
            break
        exampleJob(print_lock, worker)

if __name__ == '__main__':
    print_lock = mp.Lock()
    q = mp.Queue()
    processes = [mp.Process(target=processor, args=(print_lock, q)) for _ in range(4)]
    for process in processes:
        process.start()
    start = time.time()
    for worker in range(8):
        q.put(worker)
    for process in processes:
        q.put(None)  # quit indicator, one per process
    for process in processes:
        process.join()
    print('Entire job took:', time.time() - start)
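This is not a valid test. time.sleep releases the GIL while it waits, so you are comparing concurrent threads against concurrent processes. Threads come out faster because they have almost no startup cost.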
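To put a number on the startup cost: with 8 jobs of 10 seconds each spread over 4 workers, both versions need two rounds of sleeping, so roughly 20 seconds either way, and whatever is left over is overhead. Here is a rough sketch that measures only that overhead. The names noop and start_and_join are made up for this illustration and are not part of the question's code, and the actual numbers will depend on your OS and on the process start method.

```python
import multiprocessing as mp
import threading
import time


def noop():
    """Do nothing, so that only start/join overhead is measured."""
    pass


def start_and_join(worker_cls, count=4):
    """Start `count` workers of the given class, wait for them, return seconds."""
    start = time.perf_counter()
    workers = [worker_cls(target=noop) for _ in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # threading.Thread and mp.Process share the target/start/join interface
    print("4 threads:  ", start_and_join(threading.Thread))
    print("4 processes:", start_and_join(mp.Process))
```

Note that the multiprocessing code in the question starts its timer after process.start(), so depending on how quickly the child processes come up, part of this cost may or may not show in its measurement.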
You should do some actual computation in your threads, and then you will see the difference.
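For example, something along these lines swaps the sleep for a pure-Python loop that holds the GIL. This is only a sketch: it uses concurrent.futures pools instead of the hand-written queue workers from the question to keep it short, and cpu_job, timed, and the job size of 2,000,000 are arbitrary choices for illustration.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_job(n):
    """Pure-Python busy loop; it holds the GIL for as long as it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(executor_cls, jobs, workers=4):
    """Run cpu_job over `jobs` with the given executor; return (seconds, results)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(cpu_job, jobs))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    jobs = [2_000_000] * 8  # 8 jobs over 4 workers, as in the question
    t_threads, r_threads = timed(ThreadPoolExecutor, jobs)
    t_procs, r_procs = timed(ProcessPoolExecutor, jobs)
    assert r_threads == r_procs  # same work, only the timing should differ
    print("threads:  ", t_threads)
    print("processes:", t_procs)
```

On a multi-core machine with a standard (GIL) CPython build I would expect the process version to finish noticeably sooner, but measure it on your own setup rather than taking that on faith.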
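Adding to @zmbq's answer: threading is only going to be slower when you are performing a computationally intensive task, because of the GIL. If your operations are I/O bound, with little other work of that kind, then threading is definitely going to be faster since there is less overhead involved. Please refer to the following blog for a better understanding.
Exploiting Multiprocessing and Multithreading in Python as a Data Scientist
Hope this helps!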