Select one random index per unique element in a NumPy array and account for elements missing from a reference array
Suppose I have the following:
import numpy as np
mid_img = np.array([[0, 0, 1],
                    [2, 0, 2],
                    [3, 1, 0]])
values = np.array([0, 1, 2, 3, 4])
locations = np.full((len(values), 2), [-1, -1])
locations[np.argwhere(mid_img == values)] = mid_img # this of course doesn't work, but hopefully shows intent
'locations' would then look like this (shown only as an intermediate step for explanation; this output does not need to be produced):
[[[0, 0], [0, 1], [1, 1], [2, 2]],  # i.e., locations matching values[0]
 [[0, 2], [2, 1]],                  # i.e., locations matching values[1]
 [[1, 0], [1, 2]],                  # i.e., locations matching values[2]
 [[2, 0]],                          # i.e., locations matching values[3]
 [[-1, -1]]]                        # i.e., values[4] not found
The final output would then hold one randomly selected location from each value's row:
print(locations)
Output:
[[0, 1],
 [2, 1],
 [1, 0],
 [2, 0],
 [-1, -1]]
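Because the pick is random, the output above is only one of several acceptable results. A small checker can make the requirement concrete. This is a minimal sketch; the helper name `is_valid_selection` is hypothetical and not part of the original post.

```python
import numpy as np

def is_valid_selection(mid_img, values, locations):
    """Return True if every row of locations is a legal pick for its value."""
    n_rows, n_cols = mid_img.shape
    for value, (r, c) in zip(values, locations):
        present = np.any(mid_img == value)
        if r == -1 and c == -1:
            # The [-1, -1] sentinel is only legal when the value never occurs
            if present:
                return False
        elif not (0 <= r < n_rows and 0 <= c < n_cols):
            # Any other location must lie inside the array
            return False
        elif mid_img[r, c] != value:
            # The location must actually hold the requested value
            return False
    return True

mid_img = np.array([[0, 0, 1],
                    [2, 0, 2],
                    [3, 1, 0]])
values = np.array([0, 1, 2, 3, 4])
sample = np.array([[0, 1], [2, 1], [1, 0], [2, 0], [-1, -1]])
print(is_valid_selection(mid_img, values, sample))
```

Any candidate implementation can be run many times and passed through this check, since comparing against one fixed expected array would fail for most random draws.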
Here is a loop version of the process:
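for row_index in range(len(values)):
    # All (row, col) positions where mid_img equals this value
    found_indices = np.argwhere(mid_img == values[row_index])
    try:
        locations[row_index] = found_indices[np.random.randint(len(found_indices))]
    except ValueError:
        # randint(0) raises ValueError: the value is absent, so keep [-1, -1]
        pass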
Here is a vectorized approach:
# Flattened argsort: linear indices of mid_img grouped by value
idx = mid_img.ravel().argsort()
# Counts for all unique elements actually present in mid_img
c = np.bincount(mid_img.flat)
c = c[c > 0]
# Bins for searchsorted: group g is split into steps of 1/count ending at g+1,
# so one random draw in [g, g+1) picks exactly one position inside group g
bins = np.repeat(1.0 / c, c).cumsum()
n = len(c)
# side='right' keeps a draw of exactly g inside group g
sidx = np.searchsorted(bins, np.random.rand(n) + np.arange(n), side='right')
# Linear indices of the selected elements
out_lidx = idx[sidx]
# Convert to row-col index format
row, col = np.unravel_index(out_lidx, mid_img.shape)
# Initialize output array
locations = np.full((len(values), 2), [-1, -1])
# Mark the values that were found. This assumes values is 0, 1, 2, ... and
# that every integer up to mid_img.max() occurs in mid_img
valid = values <= mid_img[row[-1], col[-1]]
# Finally assign row, col indices into final output
locations[valid, 0] = row
locations[valid, 1] = col
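The code above relies on `values` being the contiguous range 0, 1, 2, ... with no gaps in `mid_img`, and on floating-point bin edges. A possible variant that drops both assumptions is sketched below. It swaps in a different technique: integer offsets into each sorted group instead of fractional bins, and `np.searchsorted` to match `values` against the elements actually present. The function name `pick_random_locations` and the `rng` parameter are hypothetical, and a non-empty `mid_img` is assumed.

```python
import numpy as np

def pick_random_locations(mid_img, values, rng=None):
    """One random (row, col) per entry of values, or [-1, -1] if absent."""
    rng = np.random.default_rng() if rng is None else rng
    flat = mid_img.ravel()
    # Stable sort groups equal elements; order[k] is a linear index into mid_img
    order = np.argsort(flat, kind="stable")
    # Sorted unique elements, start of each group in `order`, and group sizes
    uniq, starts, counts = np.unique(
        flat[order], return_index=True, return_counts=True
    )
    # One integer offset per group, uniform over 0 .. count - 1
    picks = order[starts + rng.integers(0, counts)]
    rows, cols = np.unravel_index(picks, mid_img.shape)
    # Match each requested value to its group, if that group exists
    pos = np.minimum(np.searchsorted(uniq, values), len(uniq) - 1)
    found = uniq[pos] == values
    locations = np.full((len(values), 2), -1)
    locations[found, 0] = rows[pos[found]]
    locations[found, 1] = cols[pos[found]]
    return locations

mid_img = np.array([[0, 0, 1],
                    [2, 0, 2],
                    [3, 1, 0]])
values = np.array([0, 1, 2, 3, 4])
print(pick_random_locations(mid_img, values))
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes the selection reproducible. Elements of `mid_img` that do not appear in `values` are simply ignored.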