Building CNN + LSTM in Keras for a regression problem. What are proper shapes?

I am working on a regression problem where I feed a set of spectrograms to a CNN + LSTM architecture in Keras. My data has the shape (n_samples, width, height, n_channels). My question is how to correctly connect the CNN to the LSTM layer: the data needs to be reshaped in some way when the convolutions are passed on to the LSTM. There are several ideas, such as combining a TimeDistributed wrapper with a reshape, but I could not get it to work.

height = 256
width = 256
n_channels = 3
seq_length = 1 #?

I started with this network:

i = Input(shape=(width, height, n_channels))
conv1 = Conv2D(filters=32,
               activation='relu',
               kernel_size=(2, 2),
               padding='same')(i)
lstm1 = LSTM(units=128,
             activation='tanh',
             return_sequences=False)(conv1)
o = Dense(1)(lstm1)

The error I get is:

ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 256, 256, 32]
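
For context, an LSTM layer consumes 3-D tensors of shape (batch, timesteps, features), while the Conv2D output here is 4-D, (batch, height, width, filters), which is exactly the ndim mismatch in the message. A minimal sketch of an input the LSTM accepts directly (the 100 and 16 below are arbitrary placeholder values):

import tensorflow as tf

# LSTM expects (batch, timesteps, features); Conv2D feature maps are
# (batch, height, width, filters) and must be reshaped first.
seq_in = tf.keras.layers.Input(shape=(100, 16))    # 100 timesteps, 16 features
seq_out = tf.keras.layers.LSTM(units=128)(seq_in)  # builds without error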

I found a suggestion here to reshape. Below is an example of how I applied the information given there; it requires adding a TimeDistributed wrapper.

i = Input(shape=(seq_length, width, height, n_channels))
conv1 = TimeDistributed(Conv2D(filters=32,
                               activation='relu',
                               kernel_size=(2, 2),
                               padding='same'))(i)
conv1 = Reshape((seq_length, height*width*n_channels))(conv1)
lstm1 = LSTM(units=128,
             activation='tanh',
             return_sequences=False)(conv1)
o = Dense(1)(lstm1)

This results in:

ValueError: Error when checking input: expected input_1 to have 5 dimensions, but got array with shape (5127, 256, 256, 3)
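
This particular error refers to the array passed at fit time rather than to the layers: with a 5-D Input, the spectrogram array itself also needs a length-1 time axis. A sketch of how that could be added, where data stands in for the actual spectrogram array:

import numpy as np

# Stand-in for the real spectrograms of shape (n_samples, width, height, n_channels).
data = np.zeros((5127, 256, 256, 3), dtype='float32')

# Insert a length-1 time axis so the array matches
# Input(shape=(seq_length, width, height, n_channels)).
data_5d = np.expand_dims(data, axis=1)  # -> (5127, 1, 256, 256, 3)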

However, in the example from the SO post above, the network is trained on video sequences, hence the need for TimeDistributed(?). In my case I have a set of spectrograms derived from signals; I am not training on videos. So one idea was to add a time_steps dimension of size 1 to get around this; something similar has been done elsewhere. The input layer then becomes:

Input(shape=(seq_length, width, height, n_channels))

This leads to an error in the reshape operation:

ValueError: total size of new array must be unchanged
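
Presumably the size mismatch comes from the 32 Conv2D filters: after TimeDistributed(Conv2D(filters=32, padding='same')) each time step holds height*width*32 values, not height*width*n_channels. A sketch of a reshape target that would at least keep the total size unchanged (building on the variables from the block above):

# After TimeDistributed(Conv2D(32, ...)) the tensor is
# (None, seq_length, 256, 256, 32), i.e. seq_length * 256*256*32 values per
# sample, while (seq_length, height*width*n_channels) asks for only 256*256*3.
conv1 = Reshape((seq_length, height * width * 32))(conv1)
# or, letting Keras infer the feature dimension:
conv1 = Reshape((seq_length, -1))(conv1)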

Any help in correctly connecting the CNN + LSTM layers would be greatly appreciated. Thanks!

One possible solution is to give the LSTM an input of shape (num_pixels, cnn_features). In your particular case, with a CNN of 32 filters, the LSTM would receive (256*256, 32):

import tensorflow as tf

cnn_features = 32

inp = tf.keras.layers.Input(shape=(256, 256, 3))
x = tf.keras.layers.Conv2D(filters=cnn_features,
                           activation='relu',
                           kernel_size=(2, 2),
                           padding='same')(inp)
# Flatten the spatial dimensions into a sequence of 256*256 "time steps",
# each carrying the 32 convolutional features of one pixel.
x = tf.keras.layers.Reshape((256 * 256, cnn_features))(x)
x = tf.keras.layers.LSTM(units=128,
                         activation='tanh',
                         return_sequences=False)(x)
out = tf.keras.layers.Dense(1)(x)
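
A minimal sketch of how this could be wired up and sanity-checked end to end; the optimizer, loss, and dummy data below are placeholders, not part of the original answer:

import numpy as np

model = tf.keras.Model(inputs=inp, outputs=out)
model.compile(optimizer='adam', loss='mse')

# Dummy spectrograms and regression targets, just to verify that shapes line up.
X = np.random.rand(4, 256, 256, 3).astype('float32')
y = np.random.rand(4, 1).astype('float32')
model.fit(X, y, batch_size=2, epochs=1)
print(model.predict(X[:2]).shape)  # (2, 1)

Note that the LSTM then unrolls over 256*256 = 65,536 time steps, so training is slow; treating each pixel as a time step is a modelling choice rather than a requirement.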