threading.Thread.start() method execution time depends on the Thread target method

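I would like to know why the execution time of threading.Thread().start() depends on the Thread's target function. I assumed that start() only signals the system that the Thread is ready to run and does no processing itself.

To illustrate this behavior: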

import numpy as np
from time import time
import threading


# Function which should take no time to execute
def pass_target_function(idx):
    pass


# Function which should take non-zero time to execute
def calculate_target_function(idx):
    a = my_arr[..., idx] * mask
    a = np.sum(a)


# Create data
my_arr_size = (1000000, 300)
my_arr = np.random.randint(255, size=my_arr_size)
mask = np.random.randint(255, size=my_arr.shape[0])


for target_function in [pass_target_function, calculate_target_function]:
    print(target_function.__name__)

    threads = []

    # Instantiate Threads
    st = time()
    for i in range(my_arr.shape[-1]):
        threads.append(threading.Thread(target=target_function, args=[i], daemon=True))
    print('\tThreads instantiated  in: {:.02f} ms'.format((time() - st) * 1000))

    # Run threads
    st = time()
    for thread in threads:
        thread.start()
    print('\tThreads started in: {:.02f} ms'.format((time() - st) * 1000))

    # Join threads
    st = time()
    for thread in threads:
        thread.join()
    print('\tThreads joined in: {:.02f} ms'.format((time() - st) * 1000))

Results:

pass_target_function
    Threads instantiated  in: 1.99 ms
    Threads started in: 105.72 ms
    Threads joined in: 1.00 ms
calculate_target_function
    Threads instantiated  in: 1.99 ms
    Threads started in: 1111.03 ms
    Threads joined in: 26.93 ms

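Why does the time needed to start the threads differ between pass_target_function and calculate_target_function?

Edit:

Following Steve's answer, I measured the start time of each individual thread.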

import collections

# Run threads, timing each start() call individually
thread_start_time = collections.deque()
for thread in threads:
    st_in = time()
    thread.start()
    thread_start_time.append(time() - st_in)

The results match the behavior described in Steve's answer:

What you'll see is that the earlier calls complete very quickly, but as more and more threads have already been started, the time for start to run starts to take longer and longer, and the exact times are inconsistent.

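If you time each call to start independently and print each result, you'll see that the earlier calls complete very quickly, but as more and more threads have already been started, each call to start takes longer and longer, and the exact times are inconsistent. This is because all of the existing threads, including the main thread, have to contend for CPU time. A call to start might begin doing its work but then get swapped out so that other threads can run for a while. The GIL makes this worse. The GIL is the mutex that Python uses to remain thread safe by not truly running code in parallel on multiple cores of the CPU.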
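A minimal sketch of that per-call measurement, in case it helps to reproduce the effect on a smaller scale. The names `busy_target`, `noop_target` and `time_each_start` are made up for this example, and the pure-Python loop stands in for the NumPy work in the question. As far as I know, CPython's start() also waits until the new thread has signalled that it is running, which is one more point where the main thread can be held up by busy workers.

```python
import threading
from time import perf_counter


def busy_target(n):
    # Pure-Python CPU-bound loop (hypothetical stand-in for real work);
    # while it runs it competes with the main thread for the GIL.
    total = 0
    for i in range(n):
        total += i * i


def noop_target(n):
    # Returns immediately, so the thread ends almost as soon as it starts.
    pass


def time_each_start(target, n_threads, work):
    """Return the wall-clock duration of every individual start() call."""
    threads = [
        threading.Thread(target=target, args=(work,)) for _ in range(n_threads)
    ]
    durations = []
    for thread in threads:
        t0 = perf_counter()
        thread.start()
        durations.append(perf_counter() - t0)
    for thread in threads:
        thread.join()
    return durations


if __name__ == "__main__":
    for target in (noop_target, busy_target):
        durations = time_each_start(target, n_threads=50, work=2_000_000)
        print(target.__name__)
        print("\tfirst start(): {:.3f} ms".format(durations[0] * 1000))
        print("\tlast start():  {:.3f} ms".format(durations[-1] * 1000))
        print("\ttotal:         {:.3f} ms".format(sum(durations) * 1000))
```

The exact numbers will vary from run to run and from machine to machine, so treat them as an indication of the trend rather than a benchmark.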

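The reason the start method runs so quickly in the no-op case is that the threads start and then immediately die, because they don't really have anything to do. This avoids the contention for CPU time that occurs when there is real work to be done in each thread.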
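One way to check this explanation, sketched under my own assumptions rather than taken from the question: make every worker wait on a threading.Event before it does any work. While a thread is blocked on the event it is not competing for CPU time, so the start() loop should behave more like the no-op case, and the real work only begins once the gate is opened. `gated_sum` and `start_behind_gate` are hypothetical names for this illustration.

```python
import threading
from time import perf_counter


def gated_sum(gate, results, idx, n):
    # Block until the main thread opens the gate; a waiting thread
    # does not compete for CPU time or the GIL.
    gate.wait()
    results[idx] = sum(i * i for i in range(n))


def start_behind_gate(n_threads, n):
    """Start all threads first, then release them to do the real work."""
    gate = threading.Event()
    results = [None] * n_threads
    threads = [
        threading.Thread(target=gated_sum, args=(gate, results, i, n))
        for i in range(n_threads)
    ]

    t0 = perf_counter()
    for thread in threads:
        thread.start()
    start_elapsed = perf_counter() - t0  # time spent in start() calls only

    gate.set()  # every worker begins computing from this point on
    for thread in threads:
        thread.join()
    return start_elapsed, results


if __name__ == "__main__":
    elapsed, _ = start_behind_gate(n_threads=50, n=2_000_000)
    print("Threads started in: {:.02f} ms".format(elapsed * 1000))
```

This does not make the overall job any faster, since the same work still has to run under the GIL. It only moves the contention out of the start() loop and into the join phase.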