How to un-shuffle data?

Question

Is there a way to undo the shuffle function from sklearn.utils? To explain my problem better: I use the shuffle function to randomize the rows of two matrices:

A_s, B_s = shuffle(A, B, random_state = 1)

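Next, I use the two matrices A_s and B_s in some operation and obtain another matrix C_s with the same dimensions, e.g. C_s = f(A_s, B_s). How do I get C back in the original row order of A and B?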

I am thinking of something similar to sklearn.preprocessing.MinMaxScaler((0,+1)), where I can use sklearn.inverse_transform() afterwards to go back.

Answer 1

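Not necessarily; it depends on your choice of f. If f is invertible and you keep track of how the rows were shuffled, it is possible, even if not efficient. The sklearn.utils shuffle method does not "keep track" of how the matrix was shuffled, so you may want to roll your own. To generate a random shuffle, generate a random permutation of range(len(A)), then iteratively swap the rows in that order. To retrieve the original matrix, you just reverse the permutation. This will let you recover C for some choices of f (e.g. matrix addition).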
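The approach described above can also be sketched with NumPy index arrays instead of explicit row swaps. This is a minimal illustration, not the answer's own code: the matrices, the seed, and the use of addition as f are assumptions chosen for the example, and it only recovers the right C when f acts row by row.

```python
import numpy as np

# Illustrative data; A and B stand in for the matrices from the question.
A = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
B = 10 * A

rng = np.random.default_rng(1)   # seeded generator, for reproducibility
perm = rng.permutation(len(A))   # random order of the row indices

A_s, B_s = A[perm], B[perm]      # shuffle both matrices the same way
C_s = A_s + B_s                  # example of a row-wise f

inverse = np.argsort(perm)       # inverse[perm[i]] == i
C = C_s[inverse]                 # rows back in the original order

print(np.array_equal(C, A + B))
```

Because the permutation is kept as an ordinary array, it can be stored next to the shuffled data and applied to any later result that has the same row order.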

(Edit: the OP asked for more information.)

This works for me, but there is probably a more efficient way to do it:

import numpy as np

def shuffle(A, axis=0, permutation=None):
    # Shuffles A in place along `axis` by cycling slices through `permutation`.
    A = np.swapaxes(A, 0, axis)  # view with the target axis first
    if permutation is None:
        permutation = np.random.permutation(len(A))
    temp = np.copy(A[permutation[0]])
    for i in range(len(A) - 1):
        A[permutation[i]] = A[permutation[i + 1]]
    A[permutation[-1]] = temp
    A = np.swapaxes(A, 0, axis)
    return A, permutation

A = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
print(A)
B, p = shuffle(A)  # NOTE: shuffle is in place; B is a view sharing memory with A!
print("shuffle A")
print(B)
D, _ = shuffle(B, permutation=p[::-1])  # walking the cycle backwards undoes it
print("unshuffle B to get A")
print(D)

B = np.copy(B)
C = A + B
print("A+B")
print(C)

A_s, p = shuffle(A)
B_s, _ = shuffle(B, permutation=p)  # same permutation, so rows stay aligned
C_s = A_s + B_s

print("shuffle A and B, then add")
print(C_s)

print("unshuffle that to get the original sum")
CC, _ = shuffle(C_s, permutation=p[::-1])
print(CC)
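The question asks for something with an inverse_transform step in the style of MinMaxScaler. One way to get that interface is a small wrapper that remembers the permutation. The sketch below is hypothetical: RowShuffler and its methods are names made up for this example and are not part of scikit-learn. Unlike the in-place function above, it returns copies.

```python
import numpy as np


class RowShuffler:
    """Hypothetical helper (not a scikit-learn class) that remembers its permutation."""

    def __init__(self, random_state=None):
        self.random_state = random_state
        self.permutation_ = None

    def fit(self, n_rows):
        # Draw one permutation and keep it so it can be undone later.
        rng = np.random.default_rng(self.random_state)
        self.permutation_ = rng.permutation(n_rows)
        return self

    def transform(self, X):
        # Fancy indexing returns a shuffled copy; X itself is untouched.
        return np.asarray(X)[self.permutation_]

    def inverse_transform(self, X_s):
        X_s = np.asarray(X_s)
        out = np.empty_like(X_s)
        # Row i of X_s came from row permutation_[i] of the original.
        out[self.permutation_] = X_s
        return out


A = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
B = np.array([[8, 7], [6, 5], [4, 3], [2, 1]])

shuffler = RowShuffler(random_state=1).fit(len(A))
A_s, B_s = shuffler.transform(A), shuffler.transform(B)
C_s = A_s * B_s                      # any row-wise f works here
C = shuffler.inverse_transform(C_s)

print(np.array_equal(C, A * B))
```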

Answer 2

import numpy as np


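def shuffle(x):
    # Example shuffle: rotate the list right by one position.
    x_s = x.copy()
    x_s.insert(0, x_s.pop())
    return x_s


def unshuffle(x, shuffle):
    # Apply the same shuffle to the indices 0..n-1 to see where each item came from.
    shuffled_ind = shuffle(list(range(len(x))))
    # argsort of a permutation gives its inverse.
    rev_shuffled_ind = np.argsort(shuffled_ind)
    x_unshuffled = np.array(x)[rev_shuffled_ind].tolist()
    return x_unshuffled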


x = [1, 2, 3, 4, 5, 6, 7]
x_s = shuffle(x)
print(x_s)
x_r = unshuffle(x_s, shuffle)
print(x_r)

A late reply here.

In practice, you will have your own shuffle() function.

The idea is to shuffle the sequence of indices in the same way and then use np.argsort() to get the indices that undo the shuffle.
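The same idea carries over to a random shuffle, as long as the shuffle is deterministic, i.e. it produces the same permutation every time it is called on a sequence of a given length. The sketch below assumes that; seeded_shuffle is a hypothetical stand-in for "your own" shuffle function, and the seed value is arbitrary.

```python
import random

import numpy as np


def seeded_shuffle(seq, seed=1):
    """Hypothetical stand-in for a user-defined shuffle; deterministic for a fixed seed."""
    out = list(seq)
    random.Random(seed).shuffle(out)
    return out


def unshuffle(shuffled, shuffle_fn):
    # Shuffle the positions 0..n-1 the same way to learn where each item came from.
    shuffled_idx = shuffle_fn(range(len(shuffled)))
    # argsort of a permutation is its inverse.
    inverse = np.argsort(shuffled_idx)
    return [shuffled[i] for i in inverse]


x = ["a", "b", "c", "d", "e"]
x_s = seeded_shuffle(x)
x_r = unshuffle(x_s, seeded_shuffle)

print(x_r == x)
```

If the shuffle draws fresh randomness on every call, this will not work; in that case the permutation has to be saved at shuffle time.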

Hope this helps!

python

shuffle

scikit-learn