How to sort list elements by specific indexes of specific values in Python

Question

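I have a dataset containing 500,000 3D coordinates and 500,000 RGB patches. First, after removing some bad data from the dataset, I want to sort the 3D coordinates so that the most similar values come first. To do that, I did the following: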

import numpy as np

# Loading saved dataset
print("Begin Loading Dataset ...")
X = np.load('train_RGB-Patches_Fire_seq01.npy')
Y = np.load('train_3DPoses-Patches_Fire_seq01.npy')
print("End Loading Dataset => Shapes : ", X.shape, Y.shape)
print("max - min 3D originals : ", Y.max(), " , ", Y.min())
print("Begin Correcting 3D_Patches Dataset ...")
# Indices (along axis 0) of samples that contain any value greater than 15
Y_faux_indx = np.unique(np.argwhere(Y > 15)[:, 0].reshape((-1, 1)))
# Drop those samples from both the labels and the patches
Y_correct = np.delete(Y, Y_faux_indx, 0)
X_correct = np.delete(X, Y_faux_indx, 0)
# Sort the remaining 3D labels row by row (lexicographic order)
Y_sorted = np.array(sorted(Y_correct.tolist())).reshape((-1, 4))
print("End Correcting 3D_Patches Dataset ...")

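Now, based on the sorted 3D labels, I want to find the indexes of the sorted rows in the corrected but still unsorted data, and then reorder the RGB data using those indexes. For that I wrote the following code, which takes a very long time to run: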

print("Begin Sorting 3D_Patches Dataset ...")
sorted_dataset_indx = []
for j in range(len(Y_sorted)):
    # Compare every unsorted row against the j-th sorted row
    element_verification = Y_correct == Y_sorted[j]
    for i in range(len(element_verification)):
        # A row matches when all of its elements are equal
        if element_verification[i].prod() == 1:
            if i not in sorted_dataset_indx:
                sorted_dataset_indx.append(i)
sorted_dataset_indx = np.array(sorted_dataset_indx)
X_sorted = X_correct[sorted_dataset_indx]
print("End Sorting 3D_Patches Dataset => Shapes : ", X_sorted.shape, Y_sorted.shape)
print("max - min 3D new : ", Y_sorted.max(), " , ", Y_sorted.min())

So I am looking for another solution that does this faster.

Answer 1

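Loops, and nested loops in particular, are usually slow in Python and should be avoided where possible.
I obviously can't reproduce your code, but here is a toy example adapted to your use case: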

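list1 = [1, 10, 2, 3, 5, 0, 1.5]
list2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
# Pair each value with its original index, then sort the pairs by value
enumerated_list1 = list(enumerate(list1))
enumerated_list1.sort(key=lambda x: x[1])
# The original indexes, now in sorted order
sorting_inds = [x[0] for x in enumerated_list1]
# Apply the same ordering to the second list
list2_sorted_by_list1 = [list2[i] for i in sorting_inds]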
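
The same idea can be carried over to NumPy arrays so that the sort order is computed once and reused, with no Python-level loop. The following is only a sketch under assumptions: the small `Y` and `X` arrays are made-up stand-ins for the real files, the labels are assumed to be 2D with shape (N, 4) as the `reshape((-1, 4))` in the question suggests, and `np.lexsort` is used here as one possible way to reproduce the row-by-row ordering that `sorted(Y_correct.tolist())` gives.

```python
import numpy as np

# Toy stand-ins for the real arrays (hypothetical data, not the asker's files)
Y = np.array([[3.0, 1.0, 2.0, 0.0],
              [1.0, 5.0, 0.0, 2.0],
              [20.0, 0.0, 0.0, 0.0],   # bad row: contains a value > 15
              [1.0, 2.0, 9.0, 9.0]])
X = np.array(['a', 'b', 'c', 'd'])

# Keep only the samples in which no value exceeds the threshold
keep = ~(Y > 15).reshape(len(Y), -1).any(axis=1)
Y_correct, X_correct = Y[keep], X[keep]

# np.lexsort uses the LAST key as the primary one, so the columns are
# passed in reverse order to sort by column 0 first, then 1, and so on
order = np.lexsort(Y_correct.T[::-1])

# One index array reorders both the labels and the patches consistently
Y_sorted = Y_correct[order]
X_sorted = X_correct[order]
```

Because `order` is applied to both arrays, each patch stays paired with its label, and rows that happen to be identical need no special handling. If the real label array has extra dimensions, it would need to be reshaped to (N, 4) before the `lexsort` call.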

Tags: python, arrays, sorting, performance, dataset