Formatting inputs for LSTM layer with variable timestep using Tensorflow

Question

According to the documentation, an LSTM layer should handle inputs with shape (None, CONST, CONST). For variable timesteps, it should be able to handle inputs with shape (None, None, CONST).

Let's say my data is the following:

X = [
    [
        [1, 2, 3],
        [4, 5, 6]
    ],
    [
        [7, 8, 9]
    ]
]
Y = [0, 1]

And my model:

model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, activation='tanh',input_shape=(None, 3)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X, Y)

My question is: how should I format these inputs to make this code work?

I cannot use a pandas DataFrame here as I usually do. If I run the code above, I get this error:

Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 2 arrays:

If I change the last line to:

model.fit(np.array(X), np.array(Y))

Now the error is:

Error when checking input: expected lstm_8_input to have 3 dimensions, but got array with shape (2, 1)

Answer 1

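You are very close, but in Keras/Tensorflow you need to pad your sequences and then use Masking to let the LSTM skip the padded timesteps. Why? Because every entry in a tensor must have the same shape, so a batch is stored as (batch_size, max_length, features). If your sequences have variable lengths, the shorter ones have to be padded up to max_length.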
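To make the "same shape" point concrete, here is a minimal sketch that pads the question's data by hand with NumPy and builds the matching boolean mask. This is not the library helper, just an illustration of the idea; the names lengths, padded and mask are made up for this example.

```python
import numpy as np

# Ragged input from the question: 2 samples, 3 features, different lengths.
X = [
    [[1, 2, 3], [4, 5, 6]],  # 2 timesteps
    [[7, 8, 9]],             # 1 timestep
]

lengths = [len(seq) for seq in X]
max_length = max(lengths)
n_features = len(X[0][0])

# Dense (batch_size, max_length, features) array, pre-filled with the pad value 0.
padded = np.zeros((len(X), max_length, n_features), dtype="float32")
for i, seq in enumerate(X):
    padded[i, :len(seq), :] = seq  # copy the real timesteps, leave the rest as 0

# True where a timestep holds real data, False where it is padding.
mask = np.arange(max_length)[None, :] < np.array(lengths)[:, None]

print(padded.shape)
print(mask)
```

The mask is the piece of information the LSTM needs in order to ignore the zero rows instead of treating them as data.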

You can use keras.preprocessing.sequence.pad_sequences to pad your sequences and obtain something like:

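import numpy as np

# The second sequence is padded with an all-zero timestep to match the longest one.
X = np.array([
    [
        [1, 2, 3],
        [4, 5, 6]
    ],
    [
        [7, 8, 9],
        [0, 0, 0],
    ]
], dtype='float32')
assert X.shape == (2, 2, 3)  # (batch_size, max_length, features)
Y = np.array([[0], [1]], dtype='float32')  # one label per sequence
assert Y.shape == (2, 1)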
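For reference, a sketch of producing that padded array with pad_sequences itself, starting from the ragged lists in the question. The arguments padding="post" and dtype="float32" are my choices here: as far as I know the defaults are padding="pre" and an integer dtype, which would put the zeros in front and truncate float features.

```python
import tensorflow as tf

# Ragged input from the question: each sample is a list of 3-feature timesteps.
X = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9]],
]

# Pad every sample to the length of the longest one, appending zeros at the end.
X_padded = tf.keras.preprocessing.sequence.pad_sequences(
    X, padding="post", dtype="float32", value=0.0
)

print(X_padded.shape)
print(X_padded[1])
```

Post-padding keeps the real timesteps at the start of each sequence, which matches the hand-written array above.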

Then use a Masking layer:

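import tensorflow as tf

model = tf.keras.models.Sequential([
    # Masking tells the LSTM to skip timesteps whose features all equal mask_value.
    # It is the first layer, so input_shape goes here rather than on the LSTM.
    tf.keras.layers.Masking(mask_value=0.0, input_shape=(None, 3)),
    tf.keras.layers.LSTM(32, activation='tanh'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(X, Y)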
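If you want to check that the mask really does what it claims, here is a self-contained sketch under a few assumptions of mine: the input shape is declared with tf.keras.Input, training runs for 2 silent epochs, and the padded row is compared against the same sequence without padding. The variable names are illustrative only.

```python
import numpy as np
import tensorflow as tf

X = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [0, 0, 0]],  # second timestep is padding
], dtype="float32")
Y = np.array([[0], [1]], dtype="float32")

masking = tf.keras.layers.Masking(mask_value=0.0)
# A timestep counts as real (True) if any of its features differs from mask_value.
mask = masking.compute_mask(tf.constant(X))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, 3)),  # None = variable number of timesteps
    masking,
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam")
history = model.fit(X, Y, epochs=2, verbose=0)

# The padded sample and its unpadded version should give the same prediction,
# because the masked timestep is skipped.
pred_padded = model.predict(X[1:], verbose=0)
pred_unpadded = model.predict(X[1:, :1], verbose=0)
same = np.allclose(pred_padded, pred_unpadded, atol=1e-5)
print(mask.numpy(), same)
```

One caveat: a real timestep whose features are all exactly 0 would be masked too, so pick a mask_value that cannot occur in your data if that is a possibility.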

You also need binary_crossentropy, since this is a binary classification problem with a single sigmoid output.
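A small NumPy sketch of why the loss matters with a single output unit. It assumes that categorical cross-entropy renormalizes the predictions so they sum to 1 over the class axis, which to my understanding is what Keras does for probability inputs. The prediction values 0.2 and 0.7 are made up.

```python
import numpy as np

y_true = np.array([[0.0], [1.0]])
y_pred = np.array([[0.2], [0.7]])  # hypothetical sigmoid outputs

# Binary cross-entropy: penalizes both the positive and the negative class.
bce = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)).mean()

# Categorical cross-entropy over a single "class": after renormalizing,
# every prediction becomes 1.0, so the loss carries no information.
probs = y_pred / y_pred.sum(axis=-1, keepdims=True)
cce = -(y_true * np.log(probs)).sum(axis=-1).mean()

print(bce, cce)
```

With one output column the categorical loss is zero whatever the model predicts, so there is nothing to learn from it, whereas the binary loss still depends on the predictions.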

Tags: python, neural-network, keras, tensorflow, recurrent-neural-network