CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model
I have a CNN-LSTM that looks like this:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv1D, Dense, Flatten, LSTM,
                                     MaxPooling1D, TimeDistributed)

SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu',
               input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
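For reference, X_train gets its 4D layout by splitting each window of SEQUENCE_LENGTH = 32 steps into n_subsequences = 4 chunks of n_steps = 8. A minimal sketch with dummy data (the 35 features are assumed to match the error message below):
import numpy as np

n_features = 35  # assumed: matches the shape in the error message below
X_train = np.random.rand(1000, SEQUENCE_LENGTH, n_features)
# split each 32-step window into 4 subsequences of 8 steps
X_train = X_train.reshape((-1, n_subsequences, n_steps, n_features))
print(X_train.shape)  # (1000, 4, 8, 35)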
I use this CNN-LSTM for a multivariate time series forecasting problem. The CNN-LSTM takes input data in a 4D format: [samples, subsequences, timesteps, features]. For some reason I need the TimeDistributed layers; otherwise I get errors like ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]. I think this has to do with the fact that Conv1D on its own only accepts 3D input, so to preserve the time series data shape we need a wrapper layer like TimeDistributed. I don't really mind using TimeDistributed layers - they are wrappers, and if they make my model work, I am happy. However, when I try to visualize my model with
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
the resulting visualization only shows Sequential():
I suspect this has to do with the TimeDistributed layers and the model not being built yet. I also cannot call model.summary() - it throws ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build. Which is strange, because I do specify input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.
I would like a working model together with a working tf.keras.utils.plot_model call. Any explanation of why I need TimeDistributed, and why it makes plot_model behave weirdly, would be greatly appreciated.
Add your input layer at the beginning. Try this:
from tensorflow.keras.layers import InputLayer

def DNN_Model(X_train):
    model = Sequential()
    # here X_train is passed as the number of features, hence DNN_Model(3) below
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel,
               activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
                                     kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    # ... rest of the model as in the question
Now you can plot the model and get a summary properly.
DNN_Model(3).summary() # OK
tf.keras.utils.plot_model(DNN_Model(3)) # OK
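As for why you need TimeDistributed at all: Conv1D by itself only accepts 3D input of shape (batch, steps, features), while TimeDistributed maps it over the extra subsequence axis. A minimal sketch with assumed toy shapes that demonstrates this:
import numpy as np
import tensorflow as tf

# toy batch: (batch, n_subsequences, n_steps, n_features) = (2, 4, 8, 3)
x = np.random.rand(2, 4, 8, 3).astype('float32')
conv = tf.keras.layers.Conv1D(filters=64, kernel_size=1, activation='relu')
td = tf.keras.layers.TimeDistributed(conv)
# Conv1D runs once per subsequence slice of shape (8, 3)
print(td(x).shape)  # (2, 4, 8, 64)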
An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper instead of the Conv1D layer:
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
        input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
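A quick way to check the fix, as a sketch with dummy data (the 35 features are taken from the error message in the question, and plot_model assumes pydot and graphviz are installed):
import numpy as np

# dummy 4D data: (samples, subsequences, timesteps, features)
X_train = np.zeros((100, n_subsequences, n_steps, 35), dtype='float32')
model = DNN_Model(X_train)
model.summary()  # builds fine now that input_shape is on the wrapper
tf.keras.utils.plot_model(model, to_file='CNN_LSTM_Visualization.png',
                          show_shapes=True)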