Select one random index per unique element in NumPy array and account for missing ones from reference array

Question

Suppose I have the following:

import numpy as np

mid_img = np.array([[0, 0, 1],
                    [2, 0, 2],
                    [3, 1, 0]])

values = np.array([0, 1, 2, 3, 4])              

locations = np.full((len(values), 2), [-1, -1])
locations[np.argwhere(mid_img == values)] = mid_img  # this of course doesn't work, but hopefully shows intent

'locations' would then look like this (shown only as an intermediate step for explanation; this output does not need to be produced):

[[[0, 0], [0, 1], [1, 1], [2, 2]],  # i.e., locations matching values[0]
 [[0, 2], [2, 1]],                  # i.e., locations matching values[1]
 [[1, 0], [1, 2]],                  # i.e., locations matching values[2]
 [[2, 0]],                          # i.e., locations matching values[3]
 [[-1, -1]]]                        # i.e., values[4] not found

The final output would then randomly select one location from each value's row:

print locations

Output:

[[0, 1],
 [2, 1],
 [1, 0],
 [2, 0],
 [-1, -1]]

Here is a loop version of the process:

for row_index in np.arange(0, len(values)):
    found_indices = np.argwhere(mid_img == row_index)
    try:
        locations[row_index] = found_indices[np.random.randint(len(found_indices))]
    except ValueError:
        pass
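
For reference, the loop can be wrapped into a self-contained function. This is only a minimal sketch: the function name `pick_locations_loop` and its signature are illustrative choices, not part of the original post. It looks each entry up via `values[row_index]` rather than the row index itself, and uses an explicit emptiness check in place of the `try`/`except`.

```python
import numpy as np

def pick_locations_loop(img, values):
    # Hypothetical helper name; one row per entry of `values`,
    # pre-filled with [-1, -1] for values that never occur in `img`.
    locations = np.full((len(values), 2), -1)
    for row_index, value in enumerate(values):
        found_indices = np.argwhere(img == value)  # (k, 2) array of row/col pairs
        if len(found_indices) > 0:
            # Pick one of the k matching positions uniformly at random
            choice = np.random.randint(len(found_indices))
            locations[row_index] = found_indices[choice]
    return locations

mid_img = np.array([[0, 0, 1],
                    [2, 0, 2],
                    [3, 1, 0]])
values = np.array([0, 1, 2, 3, 4])

locations = pick_locations_loop(mid_img, values)
print(locations)
```

The printed rows differ from run to run, but the last row stays `[-1, -1]` because 4 does not occur in `mid_img`.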

Answer 1

Here's one vectorized approach:

# Get flattened sort indices for input array
idx = mid_img.ravel().argsort()

# Get counts for all unique elements
c = np.bincount(mid_img.flat)
c = c[c>0]

# Get bins to be used with searchsorted later on so that we select 
# exactly one unique index per group. These would be linear indices
bins = np.repeat(1.0/c,c).cumsum()
n = len(c)
sidx = np.searchsorted(bins,np.random.rand(n)+np.arange(n))
out_lidx = idx[sidx]

# Convert to row-col index format
row,col = np.unravel_index(out_lidx, mid_img.shape)

# Initialize output array
locations = np.full((len(values), 2), [-1, -1])

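# Get valid ones based on values and indexed output
# (this assumes values is sorted and that the distinct entries of mid_img
# are exactly the entries of values up to mid_img.max())
valid = values <= mid_img[row[-1],col[-1]]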

# Finally assign row, col indices into final output
locations[valid,0] = row
locations[valid,1] = col
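
The `valid` mask above relies on `mid_img` containing every entry of `values` up to its maximum. If an intermediate value might be missing, one possible variation is sketched below. It swaps in a different technique from the `searchsorted`-on-bins trick: shuffle the flat indices, stable-sort them by pixel value, and take the first entry of each group. The function name `random_location_per_value` is hypothetical, and the sketch assumes that `img` is non-empty and `values` is a 1-D array.

```python
import numpy as np

def random_location_per_value(img, values):
    # Hypothetical helper, not from the original answer.
    flat = img.ravel()

    # Shuffle the flat indices, then stable-sort by pixel value: equal values
    # stay in shuffled order, so the first of each group is a uniform pick.
    perm = np.random.permutation(flat.size)
    order = perm[np.argsort(flat[perm], kind="mergesort")]

    # Distinct values present, and where each group starts in sorted order
    uniq, first = np.unique(flat[order], return_index=True)
    picked = order[first]  # one random flat index per distinct value

    # Match each requested value against the values actually present
    pos = np.minimum(np.searchsorted(uniq, values), len(uniq) - 1)
    found = uniq[pos] == values

    locations = np.full((len(values), 2), -1)
    rows, cols = np.unravel_index(picked[pos[found]], img.shape)
    locations[found, 0] = rows
    locations[found, 1] = cols
    return locations

mid_img = np.array([[0, 0, 1],
                    [2, 0, 2],
                    [3, 1, 0]])
values = np.array([0, 1, 2, 3, 4])
print(random_location_per_value(mid_img, values))

# A case with gaps: 1 and 2 are absent, 3 is present
gap_img = np.array([[0, 0],
                    [3, 3]])
gap_out = random_location_per_value(gap_img, np.array([0, 1, 2, 3]))
print(gap_out)
```

Because each requested value is matched against `uniq` directly, missing values get `[-1, -1]` wherever they fall in `values`, not only at the end.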

python

numpy

image-processing

python-2.7

array-broadcasting