Using numpy reshape to perform 3rd rank tensor unfold operation

Question

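I am trying to use the reshape command in numpy (Python) to perform the unfold operation on a 3rd-rank/mode tensor. I'm not sure whether what I am doing is correct. I found this paper online: Tensor Decomposition. I also found this code, SVD Image Compression, where the author writes: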

Color images are represented in python as 3 dimensional numpy arrays, the third dimension to represent the color values (red, green, blue). However, the svd method is applicable to two dimensional matrices. So we have to find a way to convert the 3 dimensional array to 2 dimensional arrays, apply svd and reconstruct it back as a 3 dimensional array. There are two ways to do it. We will show both these methods below.

reshape method

Layer method

Reshape method to compress a color image:

This method involves flattening the third dimension of the image array into the second dimension using numpy's reshape method.

image_reshaped = image.reshape((original_shape[0],original_shape[1]*3))

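I am trying to understand the reshape method. It looks to me like an unfold operation on a 3-rank/mode tensor. Suppose I have an array of size NxMxP: which mode would I be unfolding if I used the following Python command: reshape(N, M*P)?

Here is how I test the unfolding operation: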

import cv2
import numpy as np
from numpy import linalg as LA  # LA.svd is used below

def m_unfold(thrd_order_tensor, m):
    # Attempted mode-m unfolding of an image tensor (the 3 assumes three color channels)
    matrix = []
    if m == 1:
        matrix = thrd_order_tensor.reshape((thrd_order_tensor.shape[0], thrd_order_tensor.shape[1]*3))
        #matrix = np.hstack([thrd_order_tensor[:, :, i] for i in range(thrd_order_tensor.shape[2])])
    if m == 2:
        matrix = thrd_order_tensor.reshape((thrd_order_tensor.shape[1], thrd_order_tensor.shape[0]*3))
        #matrix = np.hstack([thrd_order_tensor[:, :, i].T for i in range(thrd_order_tensor.shape[2])])
    if m == 3:
        matrix = thrd_order_tensor.reshape((3, thrd_order_tensor.shape[0]*thrd_order_tensor.shape[1]))
        #matrix = np.vstack([thrd_order_tensor[:, :, i].ravel() for i in range(thrd_order_tensor.shape[2])])
    return matrix

def fold(matrix, os):
    # os is the original shape
    tensor = matrix.reshape(os)
    return tensor

im = cv2.imread('target.jpg')
original_shape = im.shape
image_reshaped = m_unfold(im,3)
U, sig, V = LA.svd(image_reshaped, full_matrices=False)
img_restrd = np.dot(U[:,:], np.dot(np.diag(sig[:]), V[:,:]))
img_restrd = fold(img_restrd,original_shape)
img_restrd = img_restrd.astype(np.uint8)
cv2.imshow('image',img_restrd)
cv2.waitKey(0)
cv2.destroyAllWindows()

Answer 1

The following code performs the three possible mode unfoldings of a 3D tensor (notation and example taken from this paper):

In [224]: import numpy as np

In [225]: n1, n2, n3 = 3, 4, 2

In [226]: A = 1 + np.arange(n1*n2*n3).reshape(n3, n1, n2).transpose([1, 2, 0])

In [227]: A[:, :, 0]  # frontal slice 1
Out[227]: 
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [228]: A[:, :, 1]  # frontal slice 2
Out[228]: 
array([[13, 14, 15, 16],
       [17, 18, 19, 20],
       [21, 22, 23, 24]])

In [229]: A1 = np.hstack([A[:, :, i] for i in range(A.shape[2])])

In [230]: A1 # mode 1
Out[230]: 
array([[ 1,  2,  3,  4, 13, 14, 15, 16],
       [ 5,  6,  7,  8, 17, 18, 19, 20],
       [ 9, 10, 11, 12, 21, 22, 23, 24]])

In [231]: A2 = np.hstack([A[:, :, i].T for i in range(A.shape[2])])

In [232]: A2 # mode 2
Out[232]: 
array([[ 1,  5,  9, 13, 17, 21],
       [ 2,  6, 10, 14, 18, 22],
       [ 3,  7, 11, 15, 19, 23],
       [ 4,  8, 12, 16, 20, 24]])

In [233]: A3 = np.vstack([A[:, :, i].ravel() for i in range(A.shape[2])])

In [234]: A3 # mode 3
Out[234]: 
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])
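As a cross-check, the three stacked matrices from the session above can also be reproduced with np.moveaxis and np.reshape. The sketch below rebuilds A and compares the results. Note that, for this particular A, the mode-1 and mode-2 matrices match a Fortran-order reshape, while the mode-3 matrix (built with ravel, which is C-ordered by default) matches a C-order reshape. The names A1_r, A2_r and A3_r are illustrative only.

```python
import numpy as np

n1, n2, n3 = 3, 4, 2
A = 1 + np.arange(n1 * n2 * n3).reshape(n3, n1, n2).transpose([1, 2, 0])

# Stacking-based unfoldings, as in the session above
A1 = np.hstack([A[:, :, i] for i in range(A.shape[2])])
A2 = np.hstack([A[:, :, i].T for i in range(A.shape[2])])
A3 = np.vstack([A[:, :, i].ravel() for i in range(A.shape[2])])

# Reshape-based equivalents: move the mode to the front, then reshape
A1_r = np.reshape(A, (n1, -1), order='F')
A2_r = np.reshape(np.moveaxis(A, 1, 0), (n2, -1), order='F')
A3_r = np.reshape(np.moveaxis(A, 2, 0), (n3, -1), order='C')

print(np.array_equal(A1, A1_r))
print(np.array_equal(A2, A2_r))
print(np.array_equal(A3, A3_r))
```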

Answer 2

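TL;DR: Assuming you are using the default (C-) ordering of elements, tensor.reshape(N, M*P) corresponds to the unfolding of the tensor along its first mode, according to the definition used, for example, in TensorLy.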

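The long answer is more subtle. There is more than one definition of unfolding. In general, an n-mode unfolding corresponds to i) moving the n-th mode to the beginning and ii) reshaping the result into a matrix. The way this reshaping is done gives you the different definitions of unfolding.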

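First, some terminology: a fiber along the n-th mode (i.e. dimension) of a tensor is obtained by varying the n-th index of the tensor while keeping all the others fixed. For matrices, we all know fibers as columns (varying only the first index) or rows (varying only the second index). This concept generalizes to tensors of any order (for third-order tensors, the fibers along the third mode are also called tubes).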
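To make the terminology concrete, here is a small sketch that extracts one fiber along each mode of a third-order NumPy array by fixing two indices and slicing over the third. The tensor T and the variable names are made up for illustration.

```python
import numpy as np

# Illustrative tensor of shape (N, M, P) = (2, 3, 4); T[i, j, k] = 12*i + 4*j + k
T = np.arange(24).reshape(2, 3, 4)

fiber_mode0 = T[:, 1, 2]  # vary only the first index  -> length N
fiber_mode1 = T[0, :, 2]  # vary only the second index -> length M
tube = T[0, 1, :]         # vary only the third index  -> length P

print(fiber_mode0, fiber_mode1, tube)
```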

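The n-mode unfolding of a tensor is obtained by stacking the fibers along the n-th mode of the tensor so as to obtain a matrix. The various definitions of unfolding differ in the ordering of these fibers.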
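A quick way to see this is to check that every column of an unfolded matrix is a fiber of the original tensor, and that the C-order and Fortran-order variants contain the same columns in a different order. This is only a sketch on a made-up tensor; the column-index formulas in the comments follow from how reshape enumerates the remaining two axes.

```python
import numpy as np

T = np.arange(24).reshape(2, 3, 4)  # illustrative tensor, shape (N, M, P)
N, M, P = T.shape
mode = 1
moved = np.moveaxis(T, mode, 0)     # shape (M, N, P)

unfolded_c = np.reshape(moved, (M, -1), order='C')
unfolded_f = np.reshape(moved, (M, -1), order='F')

i, k = 1, 2
fiber = T[i, :, k]  # mode-1 fiber for fixed i and k

# C order: the last remaining index (k) varies fastest -> column i*P + k
print(np.array_equal(unfolded_c[:, i * P + k], fiber))
# Fortran order: the first remaining index (i) varies fastest -> column i + N*k
print(np.array_equal(unfolded_f[:, i + N * k], fiber))
```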

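Now for tensor storage: the elements are stored in memory as one long vector, either with the last index varying fastest or with the first index varying fastest. These are known as row-major (or C) and column-major (or Fortran) ordering, respectively. When you use reshape on a tensor, you typically read the elements in the order in which they are organized in memory.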
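The difference between the two orderings is easy to see on a small matrix. In NumPy, the order argument of ravel and reshape selects the index order used to read and write the elements (for contiguous arrays this coincides with the memory layout). A minimal sketch:

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])

flat_c = X.ravel(order='C')  # last index varies fastest
flat_f = X.ravel(order='F')  # first index varies fastest

reshaped_c = X.reshape((3, 2))             # default: C order
reshaped_f = X.reshape((3, 2), order='F')  # Fortran order

print(flat_c, flat_f)
print(reshaped_c)
print(reshaped_f)
```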

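The most established definition is the one popularized by Kolda and Bader in their seminal work on tensor decompositions. Their paper Tensor Decompositions and Applications, published in SIAM Review in 2009, is an excellent introduction. Their definition of unfolding corresponds to a reshape of the tensor with Fortran ordering of the elements. This is the default in Matlab, in which they implemented their methods.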

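Since you mentioned you are using Python, I assume you are using NumPy and the default ordering of elements (i.e. C ordering). You can either use a different definition of unfolding that matches that ordering, or use a slightly more complicated function to unfold: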

import numpy as np

def f_unfold(tensor, mode=0):
    """Unfolds a tensor following the Kolda and Bader definition

        Moves the `mode` axis to the beginning and reshapes in Fortran order
    """
    return np.reshape(np.moveaxis(tensor, mode, 0), 
                      (tensor.shape[mode], -1), order='F')

Alternatively, as we do in TensorLy, you can use a definition that matches the C ordering of the elements.
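A C-order unfolding can be sketched in the same way as f_unfold, simply leaving reshape at its default order. The helper name c_unfold is hypothetical and is meant to mirror the definition described here, not to reproduce any library's source code.

```python
import numpy as np

def c_unfold(tensor, mode=0):
    """Unfold `tensor` along `mode` using C (row-major) ordering.

    Hypothetical helper: moves the `mode` axis to the front and
    reshapes with NumPy's default C order.
    """
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1))

T = np.arange(24).reshape(2, 3, 4)  # illustrative tensor
print(c_unfold(T, 0).shape, c_unfold(T, 1).shape, c_unfold(T, 2).shape)
print(np.array_equal(c_unfold(T, 0), T.reshape(2, 12)))
```

For mode 0 no axis needs to move, so the result coincides with a plain reshape of the array.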

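It does not really matter which definition you use, as long as you are consistent (though in some cases the different definitions lead to slightly different properties).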
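Consistency matters most when folding back: the fold has to undo the same ordering that the unfold used. The sketch below uses hypothetical unfold and fold helpers with an order parameter, checks the round trip for both orderings, and shows that mixing the two does not recover the tensor.

```python
import numpy as np

def unfold(tensor, mode, order='C'):
    """Move `mode` to the front, then reshape to a matrix in the given order."""
    return np.reshape(np.moveaxis(tensor, mode, 0),
                      (tensor.shape[mode], -1), order=order)

def fold(matrix, mode, shape, order='C'):
    """Inverse of `unfold` for the same `mode` and `order`."""
    moved_shape = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(np.reshape(matrix, moved_shape, order=order), 0, mode)

T = np.arange(24).reshape(2, 3, 4)  # illustrative tensor

# Round trip succeeds when unfold and fold agree on the ordering
for order in ('C', 'F'):
    for mode in range(T.ndim):
        assert np.array_equal(fold(unfold(T, mode, order), mode, T.shape, order), T)

# Unfolding in Fortran order and folding in C order scrambles the entries
mixed = fold(unfold(T, 1, 'F'), 1, T.shape, 'C')
print(np.array_equal(mixed, T))
```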

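Finally, to come back to your first question: if you have a tensor of size (N, M, P) represented as a numpy array with C ordering of the elements, then reshape(N, M*P) gives you the unfolding along the first mode of that tensor, using the "TensorLy" definition of unfolding. If you want the "Kolda & Bader" version of the unfolding, you can use the f_unfold function defined above.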
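To compare the two definitions side by side, the sketch below builds a 3x4x2 tensor whose frontal slices are filled column by column (which, as far as I recall, is the layout of the example in Kolda and Bader's paper) and prints the first row of both mode-1 unfoldings. f_unfold is repeated here so that the snippet runs on its own.

```python
import numpy as np

def f_unfold(tensor, mode=0):
    """Kolda and Bader style unfolding (Fortran-order reshape)."""
    return np.reshape(np.moveaxis(tensor, mode, 0),
                      (tensor.shape[mode], -1), order='F')

N, M, P = 3, 4, 2
# X[:, :, 0] = [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]], and so on
X = np.arange(1, N * M * P + 1).reshape((N, M, P), order='F')

c_mode0 = X.reshape(N, M * P)  # C-order ("TensorLy"-style) first-mode unfolding
f_mode0 = f_unfold(X, 0)       # Kolda and Bader first-mode unfolding

print(c_mode0[0])
print(f_mode0[0])
print(f_unfold(X, 2))
```

Both matrices have the same shape and the same columns; only the column order differs.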

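Note that, regardless of the definition you use, if you want to unfold along the n-th mode, you first have to move that mode to the beginning before reshaping (for example with np.moveaxis(tensor, n, 0)).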
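This is where a plain reshape to (M, N*P) or (P, N*M), as in the m == 2 and m == 3 branches of the question's m_unfold, goes wrong: the shape is right, but the rows no longer correspond to a single value of the chosen index. A small sketch that tracks the second index j through both versions (the tensor shape and names are illustrative):

```python
import numpy as np

N, M, P = 2, 3, 4
T = np.arange(N * M * P).reshape(N, M, P)

naive = T.reshape(M, N * P)                      # reshape only, no axis move
proper = np.moveaxis(T, 1, 0).reshape(M, N * P)  # mode-1 unfolding (C order)

# j_index[i, j, k] == j, so we can see which j ends up in each row
j_index = np.indices(T.shape)[1]
proper_rows = np.moveaxis(j_index, 1, 0).reshape(M, -1)
naive_rows = j_index.reshape(M, -1)

print(proper_rows)  # row r contains only j == r
print(naive_rows)   # rows mix several values of j
```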

I wrote a blog post about the definitions of tensor unfolding, if you are interested in the details.

python

numpy

image-processing

linear-algebra

tensor