LSTM: Input 0 of layer sequential is incompatible with the layer

Question

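I know there are several questions about this here, but I haven't found one that exactly fits my problem. I'm trying to fit an LSTM with data from Pandas DataFrames, but I'm confused about the format I have to provide them in. I created a small code snippet that shows what I'm trying to do: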

import pandas as pd, tensorflow as tf, random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

targets = pd.DataFrame(index=pd.date_range(start='2019-01-01', periods=300, freq='D'))
targets['A'] = [random.random() for _ in range(len(targets))]
targets['B'] = [random.random() for _ in range(len(targets))] 
features = pd.DataFrame(index=targets.index)
for i in range(len(features)) :
    features[str(i)] = [random.random() for _ in range(len(features))] 

model = Sequential()
model.add(LSTM(units=targets.shape[1], input_shape=features.shape))
model.compile(optimizer='adam', loss='mae')

model.fit(features, targets, batch_size=10, epochs=10)

This results in:

ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [10, 300]

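I expect this has something to do with the dimensions of the features DataFrame provided. I guess that once I fix this, the next error will mention the targets DataFrame.

As far as I understand, the 'units' parameter of my first layer defines the output dimension of this model. The inputs have to have a 3D shape, but I don't know how to create them out of the 2D world of DataFrames. I hope you can help me understand the reshaping mechanism in Python and how to use it in combination with Pandas DataFrames. (I'm quite new to Python and came from R.)

Thanks in advance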

Answer 1

The input data for the LSTM has to be 3D.

If you print the shapes of your DataFrames you get:

targets : (300, 2)
features : (300, 300)

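The input data has to be reshaped into (samples, time steps, features). This means that targets and features must have the same number of samples, so that each input window is paired with exactly one target row.

You need to choose the number of time steps for your problem, in other words, how many consecutive rows are used to make one prediction.

For example, if you have 300 days and 2 features, the time step can be 3, so that three days are used to make one prediction (you can choose this arbitrarily). Here is the code for reshaping your data (with a few more changes):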

import pandas as pd
import numpy as np
import tensorflow as tf
import random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

data = pd.DataFrame(index=pd.date_range(start='2019-01-01', periods=300, freq='D'))
data['A'] = [random.random() for _ in range(len(data))]
data['B'] = [random.random() for _ in range(len(data))]

# Choose the time_step size.
time_steps = 3
# Use numpy for the 3D array as it is easier to handle.
data = np.array(data)

def make_x_y(ts, data):
    """
    Parameters
    ts : int
        Number of time steps (rows) in each input window.
    data : numpy array
        2D array of shape (rows, features).

    This function creates two arrays, x and y.
    x is the input data and y is the target data.
    """
    x, y = [], []
    # Slide a window of ts rows over the data; the row right after
    # each window is its target.
    for offset in range(len(data) - ts):
        x.append(data[offset:ts+offset])
        y.append(data[ts+offset])
    return np.array(x), np.array(y)

x, y = make_x_y(time_steps, data)

print(x.shape, y.shape)

nodes = 100  # This is the width of the network.
out_size = 2  # Number of outputs produced by the network. Same size as features.

model = Sequential()
model.add(LSTM(units=nodes, input_shape=(x.shape[1], x.shape[2])))
model.add(Dense(out_size))  # For the output a Dense (fully connected) layer is used.
model.compile(optimizer='adam', loss='mae')
model.fit(x, y, batch_size=10, epochs=10)
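In the question the features and targets live in two separate DataFrames, while the code above windows a single array. The following is only a sketch of how the same idea might carry over, not part of the answer itself: the helper name `make_windows` and the reduced width of 5 feature columns are assumptions for illustration, and the random arrays stand in for `features.to_numpy()` and `targets.to_numpy()`. It relies on NumPy's `sliding_window_view` (available from NumPy 1.20).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(features, targets, time_steps):
    """Hypothetical helper: window 2D features into 3D LSTM input.

    features : array of shape (n_rows, n_features)
    targets  : array of shape (n_rows, n_targets)
    Returns x of shape (n_rows - time_steps, time_steps, n_features)
    and y of shape (n_rows - time_steps, n_targets).
    """
    # sliding_window_view appends the window axis last:
    # (n_windows, n_features, time_steps)
    windows = sliding_window_view(features, window_shape=time_steps, axis=0)
    # Reorder to (n_windows, time_steps, n_features) and drop the last
    # window, which has no following row to predict.
    x = windows.transpose(0, 2, 1)[:-1]
    # The window over rows [i, i + time_steps) is paired with the
    # target at row i + time_steps.
    y = targets[time_steps:]
    return x, y

rng = np.random.default_rng(0)
features = rng.random((300, 5))  # stand-in for features.to_numpy()
targets = rng.random((300, 2))   # stand-in for targets.to_numpy()

x, y = make_windows(features, targets, time_steps=3)
print(x.shape, y.shape)  # (297, 3, 5) (297, 2)
```

With arrays shaped like this, the LSTM's `input_shape` would be `(x.shape[1], x.shape[2])` and the final `Dense` layer would have `y.shape[1]` units, as in the model above.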

Answer 2

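Let's look at a few popular ways LSTMs are used.

Many to many

Example: You have a sentence (composed of words in sequential order). Given this sequence of words, you would like to predict the part of speech (POS) of each word.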

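So you have n words and you feed one word per time step to the LSTM. Each LSTM time step (also known as an LSTM unrolling) produces an output. Each word is represented by a set of features, typically a word embedding. So the input to the LSTM has size batch_size x time_steps x features.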
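To make the batch_size x time_steps x features layout concrete, here is a small NumPy sketch of how word ids could be turned into such an input. The vocabulary size and the random embedding table are assumptions for illustration only; in practice the embeddings would be learned or pretrained.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 50   # assumed vocabulary size
features = 3      # embedding size per word
batch_size = 4    # number of sentences
time_steps = 10   # words per sentence

# Hypothetical embedding table: one feature vector per word id.
embeddings = rng.normal(size=(vocab_size, features))

# Each sentence is a sequence of word ids, one id per time step.
word_ids = rng.integers(0, vocab_size, size=(batch_size, time_steps))

# Indexing with a 2D array of ids yields one vector per word:
# (batch_size, time_steps) -> (batch_size, time_steps, features)
X = embeddings[word_ids]
print(X.shape)  # (4, 10, 3)
```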

Keras code:

import numpy as np
from tensorflow import keras

inputs = keras.Input(shape=(10, 3))
# return_sequences=True keeps one output per time step.
lstm = keras.layers.LSTM(8, return_sequences=True)(inputs)
outputs = keras.layers.TimeDistributed(keras.layers.Dense(5, activation='softmax'))(lstm)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')

# Random placeholder data: 4 sentences, 10 words each, 3 features per word.
X = np.random.randn(4, 10, 3)
y = np.random.randint(0, 2, size=(4, 10, 5))

model.fit(X, y, epochs=2)
print(model.predict(X).shape)
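Note that the random `y` above only serves as a shape placeholder: each entry is an independent 0 or 1, so a label vector may contain several ones or none. With `categorical_crossentropy` one would normally supply exactly one class per word. A minimal sketch of building such one-hot labels, assuming the tags are available as integer ids (the random ids here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size, time_steps, num_classes = 4, 10, 5

# One integer class id (e.g. a POS tag) per word.
tag_ids = rng.integers(0, num_classes, size=(batch_size, time_steps))

# Row k of the identity matrix is the one-hot vector for class k.
y = np.eye(num_classes)[tag_ids]
print(y.shape)  # (4, 10, 5)
```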

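Many to one

Example: Again you have a sentence (composed of words in sequential order). Given this sequence of words, you would like to predict whether the sentiment of the sentence is positive or negative.

Keras code: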

inputs = keras.Input(shape=(10, 3))
# return_sequences=False keeps only the output of the last time step.
lstm = keras.layers.LSTM(8, return_sequences=False)(inputs)
outputs = keras.layers.Dense(5, activation='softmax')(lstm)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')

# Random placeholder data: one label vector per sentence.
X = np.random.randn(4, 10, 3)
y = np.random.randint(0, 2, size=(4, 5))

model.fit(X, y, epochs=2)
print(model.predict(X).shape)
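The softmax output has one row of class probabilities per sentence. To turn that into a single predicted class per sentence, one would typically take the argmax over the class axis. The sketch below uses a made-up probability array as a stand-in for `model.predict(X)`, so it runs without a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for model.predict(X): 4 sentences, 5 class scores each.
logits = rng.normal(size=(4, 5))
# Softmax over the class axis, as the final Dense layer would apply.
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)

# One predicted class id per sentence.
predicted = probs.argmax(axis=-1)
print(probs.shape, predicted.shape)  # (4, 5) (4,)
```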

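Many to multi-headed

Example: You have a sentence (composed of words in sequential order). Given this sequence of words, you would like to predict the sentiment of the sentence as well as its author.

This is a multi-headed model where one head predicts the sentiment and the other head predicts the author. Both heads share the same LSTM backbone.

Keras code: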

inputs = keras.Input(shape=(10, 3))
# Shared backbone: both heads read the same final LSTM output.
lstm = keras.layers.LSTM(8, return_sequences=False)(inputs)
output_A = keras.layers.Dense(5, activation='softmax')(lstm)
output_B = keras.layers.Dense(5, activation='softmax')(lstm)

model = keras.Model(inputs=inputs, outputs=[output_A, output_B])
model.compile(loss='categorical_crossentropy', optimizer='adam')

# Random placeholder data: one label array per head.
X = np.random.randn(4, 10, 3)
y_A = np.random.randint(0, 2, size=(4, 5))
y_B = np.random.randint(0, 2, size=(4, 5))

model.fit(X, [y_A, y_B], epochs=2)
y_hat_A, y_hat_B = model.predict(X)
print(y_hat_A.shape, y_hat_B.shape)

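What you are looking for is a many to multi-headed model, where the predictions for A are made by one head and the predictions for B by the other head.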
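For the data in the question, A and B are real-valued columns rather than classes, so each head would presumably be a single-unit `Dense` layer trained with a regression loss such as the `'mae'` used in the question. The targets would then be passed as one array per head. A minimal sketch of that split, assuming the targets have already been windowed into 297 rows (300 days with 3 time steps); the random array is a stand-in for the real targets:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for windowed targets: 297 samples, columns A and B.
y = rng.random((297, 2))

# Slicing with a range keeps the column axis, so each head gets shape (297, 1).
y_A = y[:, 0:1]
y_B = y[:, 1:2]
print(y_A.shape, y_B.shape)  # (297, 1) (297, 1)
```

These two arrays would take the place of `[y_A, y_B]` in the `model.fit` call above.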

Answer 3

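Well, just to close this issue, I would like to share a solution I worked out in the meantime. The TimeseriesGenerator class in tf.keras made it quite easy to feed correctly shaped data to an LSTM model: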

from keras.preprocessing.sequence import TimeseriesGenerator
import numpy as np

window_size   = 7
batch_size    = 8
sampling_rate = 1

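# X_train, y_train, X_valid, y_valid, X_test and y_test are assumed to be
# DataFrames from an earlier train/validation/test split (not shown here).
train_gen = TimeseriesGenerator(X_train.values, y_train.values,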
                               length=window_size, sampling_rate=sampling_rate,
                               batch_size=batch_size)

valid_gen = TimeseriesGenerator(X_valid.values, y_valid.values,
                               length=window_size, sampling_rate=sampling_rate,
                               batch_size=batch_size)
test_gen  = TimeseriesGenerator(X_test.values, y_test.values,
                               length=window_size, sampling_rate=sampling_rate,
                               batch_size=batch_size)
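To illustrate what such a generator hands to the model, here is a plain NumPy sketch of the windowing idea. It is not the library's implementation: `window_batches` is a hypothetical helper, the sampling rate is fixed at 1, and the random arrays stand in for `X_train.values` and `y_train.values`.

```python
import numpy as np

def window_batches(data, targets, length, batch_size):
    """Hypothetical stand-in for a windowing generator.

    Yields (x, y) batches where x[k] holds `length` consecutive rows of
    `data` and y[k] is the row of `targets` that follows that window.
    """
    # The first usable target index is `length`; every later row is usable too.
    indices = np.arange(length, len(data))
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        x = np.stack([data[i - length:i] for i in batch])
        y = targets[batch]
        yield x, y

rng = np.random.default_rng(0)
X_train = rng.random((100, 4))  # stand-in for X_train.values
y_train = rng.random((100, 1))  # stand-in for y_train.values

x_batch, y_batch = next(window_batches(X_train, y_train, length=7, batch_size=8))
print(x_batch.shape, y_batch.shape)  # (8, 7, 4) (8, 1)
```

Each batch already has the (samples, time steps, features) layout the LSTM expects, which is why no manual reshaping is needed.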

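There are many other ways to implement such generators, for example with more_itertools, which provides the function windowed, or with tf.data.Dataset and its window method. For me, TimeseriesGenerator was sufficient for the tests I ran. If you would like to see an example that models the DAX based on some stocks, I am sharing a notebook on Github.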
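The windowing idea behind these alternatives can be sketched with the standard library alone. The helper below is hypothetical and only similar in spirit to a sliding-window function; it is not a drop-in replacement for any library routine (for instance, it yields nothing when the input is shorter than the window).

```python
from collections import deque

def sliding_windows(iterable, n):
    """Hypothetical helper: yield each run of n consecutive items as a tuple."""
    window = deque(maxlen=n)
    for item in iterable:
        window.append(item)
        # Once the deque is full, every new item completes a new window.
        if len(window) == n:
            yield tuple(window)

days = [1, 2, 3, 4, 5]
print(list(sliding_windows(days, 3)))  # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```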
