Why do we need to reshape the input for an LSTM?

Question

I read this article about LSTMs:

https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/

The first basic example is the "Vanilla LSTM": predicting the next value in a time series,

where the input is [10, 20, 30, 40, 50, 60, 70, 80, 90].

In the article, the author splits the input sequence into a matrix:

X,              y
10, 20, 30      40
20, 30, 40      50
30, 40, 50      60
...

I don't understand why the input needs to be reshaped:

reshape from [samples, timesteps] into [samples, timesteps, features]

1. Why do we need this?

Also, what if my input looks like this (the basic example plus an ID column):

ID    X,                y
1     10, 20, 30        40
1     20, 30, 40        50
2     30, 40, 50        60
2     40, 50, 60        70
...

2. How do we reshape it? What would the new dimension be?

Answer 1

I'm not sure where the ID comes from, but for LSTM networks in Keras the input needs to be 3-dimensional.

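Originally you have a 2-dimensional matrix as input, where each row is one sample (a window of consecutive time steps), so its shape is
[samples, timesteps].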

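But since the input is expected to be 3-dimensional, you reshape it to [samples, timesteps, 1]. Here the 1 is the number of features, or variables, in the data. Since this is a univariate time series (a sequence of a single variable), n_features is 1.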

This can easily be done with np_array.reshape(np_array.shape[0], np_array.shape[1], 1).
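To make the effect of the extra axis concrete, here is a minimal sketch using the X matrix from the question. The variable names `X` and `X3d` are chosen for illustration only; the data itself is unchanged, each value is simply wrapped in a length-1 feature axis.

```python
import numpy as np

# Six samples of three time steps each, as in the question's X matrix.
X = np.array([
    [10, 20, 30],
    [20, 30, 40],
    [30, 40, 50],
    [40, 50, 60],
    [50, 60, 70],
    [60, 70, 80],
])

n_features = 1  # univariate series: one value per time step
X3d = X.reshape(X.shape[0], X.shape[1], n_features)

print(X.shape)          # (6, 3)
print(X3d.shape)        # (6, 3, 1)
print(X3d[0].tolist())  # [[10], [20], [30]]
```

Writing `X[..., np.newaxis]` should give the same result as the explicit reshape.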

Answer 2

I think this link will help you understand why.

You always have to give a three-dimensional array as an input to your LSTM network. Where the first dimension represents the batch size, the second dimension represents the number of time-steps you are feeding a sequence. And the third dimension represents the number of units in one input sequence. For example, input shape looks like (batch_size, time_steps, seq_len)

Let's take your example sequence: [10, 20, 30, 40, 50, 60, 70, 80, 90]

Once we apply split_sequence as described in your article, we get a 2-D feature array X of shape (6, 3), where 6 is the number of samples and 3 is the number of time steps.
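As a rough illustration of where the (6, 3) shape comes from, here is a sketch of a windowing helper. It borrows the name `split_sequence` from the article, but the body below is my own minimal version and may differ from the article's code.

```python
import numpy as np

def split_sequence(sequence, n_steps):
    """Slide a window of n_steps over the sequence; the value right after each window is its target."""
    X, y = [], []
    for start in range(len(sequence) - n_steps):
        end = start + n_steps
        X.append(sequence[start:end])  # n_steps consecutive inputs
        y.append(sequence[end])        # the value to predict
    return np.array(X), np.array(y)

raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
X, y = split_sequence(raw_seq, n_steps=3)

print(X.shape)     # (6, 3)
print(y.tolist())  # [40, 50, 60, 70, 80, 90]
```

Nine values with a window of three leave six positions that still have a following value to predict, hence six samples.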

But since the model only accepts 3-D input, we have to reshape the 2-D array into a 3-D one.

So from (6, 3) --> (6, 3, 1).

To answer your second question, you can simply reshape your 2-D feature array X as follows:

# Given that X is a numpy array
samples = X.shape[0]
steps = X.shape[1]
X = X.reshape(samples, steps, 1)
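The snippet above does not say what to do with the ID column. One possible reading, assuming the ID only labels which series a row belongs to and is not itself a predictor, is to set it aside and reshape the remaining columns as before. The array name `table` and its column layout are assumptions made for this sketch.

```python
import numpy as np

# Hypothetical table: columns are ID, three time steps, and the target y.
table = np.array([
    [1, 10, 20, 30, 40],
    [1, 20, 30, 40, 50],
    [2, 30, 40, 50, 60],
    [2, 40, 50, 60, 70],
])

ids = table[:, 0]  # kept aside for bookkeeping, not fed to the model
X = table[:, 1:4]  # shape (4, 3): [samples, timesteps]
y = table[:, 4]    # shape (4,)

# Same reshape as before: the new axis is the feature axis, not the ID.
X = X.reshape(X.shape[0], X.shape[1], 1)

print(X.shape)  # (4, 3, 1)
```

Under this assumption the new dimension is still the number of features (1); the ID does not become a dimension of its own.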

Answer 3

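The 3-D input to an LSTM can be thought of as (number of groups, time steps in each group, number of columns or variable types). For example, (100, 10, 1) can be read as 100 groups, each with 10 rows and 1 column. One column means there is only one type of variable, i.e. a single x.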
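A small sketch of that reading, with a second variable added to show what a last dimension larger than 1 would look like. The names `temperature` and `humidity` and the random data are purely hypothetical.

```python
import numpy as np

# 100 groups (samples), each with 10 time steps and 1 variable.
univariate = np.zeros((100, 10, 1))

# Two hypothetical variables observed over the same 100 x 10 grid.
rng = np.random.default_rng(0)
temperature = rng.random((100, 10))
humidity = rng.random((100, 10))

# Stacking along a new last axis gives one column per variable.
multivariate = np.stack([temperature, humidity], axis=-1)

print(univariate.shape)    # (100, 10, 1)
print(multivariate.shape)  # (100, 10, 2)
```

Here `multivariate[:, :, 0]` holds the first variable and `multivariate[:, :, 1]` the second, so the last axis indexes the variable type.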
