Fastest way to concatenate large numpy arrays

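I'm doing some optical flow analysis. The goal is to loop through every frame of a long movie, compute the dense optical flow, and append the resulting angles and magnitudes to growing numpy arrays. I'm finding that each successive loop takes longer and longer to complete, and I'm not sure why. Here is a simple example loop that reproduces the problem: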

import time
import numpy as np

arraySize = (1, 256, 256)          # correct array size
emptyArray = np.zeros(arraySize)   # empty array to fill with angles from every image pair
timeElapsed = []                   # empty list to fill with time values

for i in range(100):               # iterates through the frames in the image stack
    start = time.time()            # start the time
    newArray = np.zeros(arraySize) # makes an example new array
    emptyArray = np.concatenate((emptyArray, newArray)) # concats new and growing arrays
    end = time.time()              # stop the time
    timeElapsed.append(end-start)  # append the total time for the loop to the growing list

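If I then plot the elapsed time for each loop, I see it increase linearly with every iteration. In this example it is still tolerable, but for my actual dataset it is not.

I'm guessing that larger arrays take more time to process, but I'm not sure how to avoid this. Is there a better, faster, or more Pythonic way to do this?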

------------EDIT------------

Following mathfux's suggestion, I modified the loop as follows:

arraySize = (1, 256, 256)          # correct array size
emptyArray = np.concatenate([np.zeros(arraySize) for i in range(100)])   # empty array to fill with angles from every image pair
timeElapsed = []                   # empty list to fill with time values

for i in range(100):               # iterates through the frames in the image stack
    start = time.time()            # start the time
    newArray = np.zeros(arraySize) # makes an example new array
    emptyArray[i] = newArray[0]    # overwrites empty array with newarray values at the relevant position
    end = time.time()              # stop the time
    timeElapsed.append(end-start)  # append the total time for the loop to the growing list

The time per loop is now very consistent between iterations.

Thanks!
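For reference, the preallocation in the edit can also be sketched without the initial concatenate, assuming the number of frames is known up front. The helper name `fill_preallocated` and the stand-in `example_frame` are hypothetical and only take the place of the real per-frame optical-flow result:

```python
import numpy as np

def fill_preallocated(n_frames, frame_shape, compute_frame):
    """Allocate the full output once, then write each frame into its slot."""
    # One allocation for the whole stack; every slot is overwritten below.
    out = np.empty((n_frames, *frame_shape))
    for i in range(n_frames):
        out[i] = compute_frame(i)  # copies a single frame, independent of i
    return out

def example_frame(i):
    # Hypothetical stand-in for the angles computed from frame pair i.
    return np.full((256, 256), float(i))

angles = fill_preallocated(100, (256, 256), example_frame)
print(angles.shape)
```

Each iteration writes one frame's worth of data, so the cost per loop does not depend on how many frames have already been stored.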

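Every time you append a new array, new memory is allocated for a larger array and all of the existing data is copied into it. That is very expensive. A better solution is to allocate memory of the final size only once, by calling np.concatenate a single time on all of your data: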

np.concatenate([np.zeros(arraySize) for i in range(100)])

This approach appears to be about 28 times faster on my machine:

start = time.time()                    # start the time
arrays = []
for i in range(100):                   # iterates through the frames in the image stack
    arrays.append(np.zeros(arraySize)) 

# Concatenate everything in a single call
newArray=np.concatenate(arrays)
end = time.time()              # stop the time
timeElapsed2 = end-start  

print("Elapsed:", timeElapsed2)

print("sum elapsed times of first method:", np.sum(timeElapsed))

Elapsed: 0.021436214447021484

sum elapsed times of first method: 0.6163454055786133
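The gap can be made plausible by counting how many frames each approach has to write. The sketch below is only a rough cost model (it ignores allocation overhead and caching, so it will not reproduce the measured ratio exactly), and the function names are made up for illustration:

```python
import numpy as np

def frames_copied_growing(n_iterations):
    # The array starts with 1 frame; iteration k (0-based) builds a
    # brand-new array holding k + 2 frames.
    return sum(k + 2 for k in range(n_iterations))

def frames_copied_single(n_iterations):
    # One concatenate at the end writes each frame exactly once.
    return n_iterations

def measure_growing(n_iterations, frame_shape=(1, 4, 4)):
    # Cross-check the formula by running the growing loop on tiny frames
    # and summing the size of every intermediate result.
    arr = np.zeros(frame_shape)
    total = 0
    for _ in range(n_iterations):
        arr = np.concatenate((arr, np.zeros(frame_shape)))
        total += arr.shape[0]
    return total

print(frames_copied_growing(100), frames_copied_single(100))  # 5150 100
print(measure_growing(100))                                   # 5150
```

The growing version writes a number of frames that scales with the square of the iteration count, while the single concatenate scales linearly, which is why the per-loop time in the question keeps rising.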

Using an accelerator such as a GPU or TPU can speed up the code. For example, with the jax library, in my test on a Google Colab TPU your code ran about 1000 times faster than the other answers (roughly 40 to 50 µs per loop):

import time
import numpy as np
from jax import jit

@jit
def zac():
    arraySize = (1, 256, 256)          # correct array size
    emptyArray = np.zeros(arraySize)   # empty array to fill with angles from every image pair
    timeElapsed = []                   # empty list to fill with time values

    for i in range(100):               # iterates through the frames in the image stack
        start = time.time()            # start the time
        newArray = np.zeros(arraySize) # makes an example new array
        emptyArray = np.concatenate((emptyArray, newArray)) # concats new and growing arrays
        end = time.time()              # stop the time
        timeElapsed.append(end-start)  # append the total time for the loop to the growing list

Timing it with %timeit -n10000 zac() gives:

10000 loops, best of 5: 47.7 µs per loop