All possible combinations of 2D numpy arrays

Question

I have four numpy arrays, for example:

a1=np.array([[-24.4925, 295.77  ],
             [-24.4925, 295.77  ],
             [-14.3925, 295.77  ],
             [-16.4125, 295.77  ],
             [-43.6825, 295.77  ],
             [-22.4725, 295.77  ]])

a2=np.array([[-26.0075, 309.39  ],
             [-24.9975, 309.39  ],
             [-14.8975, 309.39  ],
             [-17.9275, 309.39  ],
             [-46.2075, 309.39  ],
             [-23.9875, 309.39  ]])

a3=np.array([[-25.5025, 310.265 ],
             [-25.5025, 310.265 ],
             [-15.4025, 310.265 ],
             [-17.4225, 310.265 ],
             [-45.7025, 310.265 ],
             [-24.4925, 310.265 ]])

a4=np.array([[-27.0175, 326.895 ],
             [-27.0175, 326.895 ],
             [-15.9075, 326.895 ],
             [-18.9375, 326.895 ],
             [-48.2275, 326.895 ],
             [-24.9975, 326.895 ]])

I want to build every possible combination of rows across the arrays, concatenating the chosen rows, for example:

array[-24.4925, 295.77, -26.0075, 309.39, -25.5025, 310.265, -27.0175, 326.895]

and

array[-24.4925, 295.77, -26.0075, 309.39, -25.5025, 310.265, -27.0175, 326.895]

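i.e. [a1[0],a2[0],a3[0],a4[0]], [a1[0],a2[0],a3[0],a4[1]], and so on. (The two examples above look identical because a4[0] and a4[1] hold the same values.)

Other than looping over the four arrays, what is the fastest way to do this?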

Answer 1

Well, there is nothing faster than looping, but there is a clean way to do it without writing the loops yourself:

import numpy as np
import itertools

a1=np.array([[-24.4925, 295.77  ],
             [-24.4925, 295.77  ],
             [-14.3925, 295.77  ],
             [-16.4125, 295.77  ],
             [-43.6825, 295.77  ],
             [-22.4725, 295.77  ]])

a2=np.array([[-26.0075, 309.39  ],
             [-24.9975, 309.39  ],
             [-14.8975, 309.39  ],
             [-17.9275, 309.39  ],
             [-46.2075, 309.39  ],
             [-23.9875, 309.39  ]])

a3=np.array([[-25.5025, 310.265 ],
             [-25.5025, 310.265 ],
             [-15.4025, 310.265 ],
             [-17.4225, 310.265 ],
             [-45.7025, 310.265 ],
             [-24.4925, 310.265 ]])

a4=np.array([[-27.0175, 326.895 ],
             [-27.0175, 326.895 ],
             [-15.9075, 326.895 ],
             [-18.9375, 326.895 ],
             [-48.2275, 326.895 ],
             [-24.9975, 326.895 ]])

arrays = [a1, a2, a3, a4]

for pieces in itertools.product(*arrays):
    combined = np.concatenate(pieces, axis = 0)
    print(combined)

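The standard library itertools module (https://docs.python.org/3/library/itertools.html) provides a range of tools for generating products, combinations, permutations, and so on of iterables. Since numpy arrays happen to be iterable (iteration runs over the first index), we can use itertools to grab a slice (here, a row) from each array and then use numpy to combine them.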
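If the combinations are needed as one array rather than printed one by one, the same idea can collect them into a 2D result. This is a minimal sketch, not part of the original answer; the small arrays x and y are hypothetical stand-ins for a1..a4.

```python
import itertools
import numpy as np

# Hypothetical small arrays standing in for a1..a4
x = np.array([[1.0, 10.0],
              [2.0, 10.0]])
y = np.array([[3.0, 20.0],
              [4.0, 20.0],
              [5.0, 20.0]])

# Iterating a 2D array yields its rows, so product() pairs every row of x
# with every row of y; each pair is concatenated into one flat row.
rows = [np.concatenate(pieces) for pieces in itertools.product(x, y)]
result = np.array(rows)

print(result.shape)        # (6, 4): 2 * 3 combinations, 2 + 2 values each
print(result[0].tolist())  # [1.0, 10.0, 3.0, 20.0]
```

With four arrays of six rows each, as in the question, the result would have 6**4 = 1296 rows of 8 values.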

Answer 2

Here is a numpy solution, based on the Cartesian product implementation from here.

import numpy as np

# a1..a4 are the arrays defined in the question
arr = np.stack([a1, a2, a3, a4])

print(arr.shape) # (4, 6, 2)
n, m, k = arr.shape

# from 
def cartesian_product(*arrays):
    la = len(arrays)
    dtype = np.result_type(*arrays)
    arr = np.empty([len(a) for a in arrays] + [la], dtype=dtype)
    for i, a in enumerate(np.ix_(*arrays)):
        arr[...,i] = a
    return arr.reshape(-1, la)

inds = cartesian_product(*([np.arange(m)] * n))
res = np.take_along_axis(arr, inds.T[...,None], 1).swapaxes(0,1).reshape(-1, n*k)

print(res[0])
# [-24.4925 295.77   -26.0075 309.39   -25.5025 310.265  -27.0175 326.895 ]

In this example, the inds array looks like this:

print(inds[:10])
# [[0 0 0 0]
#  [0 0 0 1]
#  [0 0 0 2]
#  [0 0 0 3]
#  [0 0 0 4]
#  [0 0 0 5]
#  [0 0 1 0]
#  [0 0 1 1]
#  [0 0 1 2]
#  [0 0 1 3]]
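As a sanity check on that index table, the sketch below builds the same kind of table for a smaller case and compares it with itertools.product over row indices. It uses np.indices in place of the answer's cartesian_product helper, which is a swapped-in shortcut for illustration; the sizes n and m are hypothetical.

```python
import itertools
import numpy as np

n, m = 3, 2  # hypothetical sizes: 3 arrays with 2 rows each

# np.indices returns one index grid per array; flattening in C order
# makes the last column vary fastest, as in the table above.
inds = np.indices((m,) * n).reshape(n, -1).T

# The same table, generated row by row with itertools
expected = np.array(list(itertools.product(range(m), repeat=n)))

print(inds.shape)                      # (8, 3)
print(np.array_equal(inds, expected))  # True
```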

We can then use np.take_along_axis to select the appropriate rows for each combination.
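The shapes involved in that selection step are easier to follow on a tiny case. The following is a sketch under assumed inputs (b1 and b2 are hypothetical arrays, and the index table is built with np.indices rather than the helper above); it also shows that plain integer-array indexing gives the same result.

```python
import numpy as np

# Hypothetical stand-ins: n=2 arrays, m=3 rows each, k=2 columns
b1 = np.array([[1, 10], [2, 10], [3, 10]])
b2 = np.array([[4, 20], [5, 20], [6, 20]])
arr = np.stack([b1, b2])                        # shape (n, m, k) = (2, 3, 2)
n, m, k = arr.shape

# Every pair of row indices, last column varying fastest
inds = np.indices((m,) * n).reshape(n, -1).T    # shape (m**n, n) = (9, 2)

# take_along_axis needs indices with the same ndim as arr:
# axis 0 = which array, axis 1 = which combination, axis 2 broadcasts over k
idx = inds.T[..., None]                         # shape (n, m**n, 1)
picked = np.take_along_axis(arr, idx, axis=1)   # shape (n, m**n, k)

# Put the combination axis first, then flatten each combination into one row
res = picked.swapaxes(0, 1).reshape(-1, n * k)  # shape (m**n, n*k)

# Equivalent selection with integer-array indexing
alt = arr[np.arange(n), inds].reshape(-1, n * k)

print(picked.shape)              # (2, 9, 2)
print(res.shape)                 # (9, 4)
print(res[1].tolist())           # [1, 10, 5, 20]
print(np.array_equal(res, alt))  # True
```

Note that the number of output rows grows as m**n, so memory use can become the limiting factor well before speed does.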
