Keras example word-level model with integer sequences gives `expected ndim=3, found ndim=4`
I am trying to implement the Keras word-level example from their blog, listed under the bonus section -> "What if I want to use a word-level model with integer sequences?"
I have tagged the layers with names to help me reconnect them later from a loaded model to an inference model. I think I have followed their example model:
# Define an input sequence and process it - where the shape is (timesteps, n_features)
encoder_inputs = Input(shape=(None, src_vocab), name='enc_inputs')
# Add an embedding layer to process the integer encoded words to give some 'sense' before the LSTM layer
encoder_embedding = Embedding(src_vocab, latent_dim, name='enc_embedding')(encoder_inputs)
# The return_state constructor argument configures an RNN layer to return a list where the first entry is the outputs
# and the next entries are the internal RNN states. This is used to recover the states of the encoder.
encoder_outputs, state_h, state_c = LSTM(latent_dim, return_state=True, name='encoder_lstm')(encoder_embedding)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state of the RNN.
decoder_inputs = Input(shape=(None, target_vocab), name='dec_inputs')
decoder_embedding = Embedding(target_vocab, latent_dim, name='dec_embedding')(decoder_inputs)
# The return_sequences constructor argument configures an RNN to return its full sequence of outputs (instead of
# just the last output, which is the default behavior).
decoder_lstm = LSTM(latent_dim, return_sequences=True, name='dec_lstm')(decoder_embedding, initial_state=encoder_states)
decoder_outputs = Dense(target_vocab, activation='softmax', name='dec_outputs')(decoder_lstm)
# Put the model together
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
but I get
ValueError: Input 0 is incompatible with layer encoder_lstm: expected ndim=3, found ndim=4
on the line
encoder_outputs, state_h, state_c = LSTM(...
What am I missing? Or does the example on the blog assume a step that I have skipped?
Update:
I am training with:
X = [source_data, target_data]
y = offset_data(target_data)
model.fit(X, y, ...)
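offset_data is not shown here; it presumably shifts target_data one timestep ahead of the decoder input so the model trains with teacher forcing. A hypothetical sketch of such a helper, assuming target_data is a 2D array of padded integer sequences (the real implementation may differ):
import numpy as np

def offset_data(target_data):
    # Hypothetical helper: shift each target sequence left by one timestep,
    # so y[:, t] is the word the decoder should predict after seeing target_data[:, t]
    y = np.zeros_like(target_data)
    y[:, :-1] = target_data[:, 1:]
    return y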
Update 2:
So, I'm not quite there yet. I have decoder_lstm and decoder_outputs defined as above and have fixed the inputs. When I load my model from the h5 file and build my inference model, I try to hook into the training model with
decoder_inputs = model.input[1] # dec_inputs (Input(shape=(None,)))
# decoder_embedding = model.layers[3] # dec_embedding (Embedding(target_vocab, latent_dim))
target_vocab = model.output_shape[2]
decoder_state_input_h = Input(shape=(latent_dim,), name='input_3') # named to avoid conflict
decoder_state_input_c = Input(shape=(latent_dim,), name='input_4')
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
# Use decoder_lstm from the training model
# decoder_lstm = LSTM(latent_dim, return_sequences=True)
decoder_lstm = model.layers[5] # dec_lstm
decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
but I get an error
ValueError: Input 0 is incompatible with layer dec_lstm: expected ndim=3, found ndim=2
Trying to pass decoder_embedding instead of decoder_inputs also fails.
I am trying to adapt the lstm_seq2seq_restore.py example, but it does not include the complexity of an embedding layer.
Update 3:
When I build the inference model using decoder_outputs, state_h, state_c = decoder_lstm(decoder_embedding, ...), I have confirmed that decoder_embedding is an object of type Embedding, but I get:
ValueError: Layer dec_lstm was called with an input that isn't a symbolic tensor. Received type: <class 'keras.layers.embeddings.Embedding'>. Full input: [<keras.layers.embeddings.Embedding object at 0x1a1f22eac8>, <tf.Tensor 'input_3:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'input_4:0' shape=(?, 256) dtype=float32>]. All inputs to the layer should be tensors.
The full code for this model is on Bitbucket.
The problem is with the input shape of the Input layers. An embedding layer takes a sequence of integers as input, corresponding to the indices of the words in a sentence. Since the number of words in the sentences is not fixed here, you must set the input shape of the Input layers to (None,).
I think you are confusing this with the case where the model has no embedding layer, in which the model's input shape is (timesteps, n_features) to make it compatible with the LSTM layer.
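In other words, with an embedding layer the model inputs are just the integer word indices with shape (batch, timesteps); it is the Embedding layer that produces the 3D tensor the LSTM expects. A minimal sketch of the corrected input definitions, assuming src_vocab, target_vocab and latent_dim are defined as in the question:
from keras.layers import Input, Embedding, LSTM

# Integer word indices only, so shape=(None,) allows variable-length sentences
encoder_inputs = Input(shape=(None,), name='enc_inputs')
# Embedding maps (batch, timesteps) to (batch, timesteps, latent_dim), i.e. the ndim=3 the LSTM expects
encoder_embedding = Embedding(src_vocab, latent_dim, name='enc_embedding')(encoder_inputs)
encoder_outputs, state_h, state_c = LSTM(latent_dim, return_state=True, name='encoder_lstm')(encoder_embedding)

# The decoder input gets the same (None,) shape
decoder_inputs = Input(shape=(None,), name='dec_inputs')
decoder_embedding = Embedding(target_vocab, latent_dim, name='dec_embedding')(decoder_inputs)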
Update:
You need to pass decoder_inputs through the embedding layer first, and then pass the resulting output tensor to the decoder_lstm layer, like this:
decoder_inputs = model.input[1] # (Input(shape=(None,)))
# pass the inputs to the embedding layer
decoder_embedding = model.get_layer(name='dec_embedding')(decoder_inputs)
# ...
decoder_lstm = model.get_layer(name='dec_lstm') # dec_lstm
decoder_outputs, state_h, state_c = decoder_lstm(decoder_embedding, ...)
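Put together, the inference-side decoder can be assembled roughly as follows. This is only a sketch in the style of lstm_seq2seq_restore.py; it assumes latent_dim matches the trained model and that dec_lstm was created with return_state=True (see the next update):
from keras.layers import Input
from keras.models import Model

decoder_inputs = model.input[1]  # dec_inputs, shape (None,), integer word indices
decoder_state_input_h = Input(shape=(latent_dim,), name='input_3')
decoder_state_input_c = Input(shape=(latent_dim,), name='input_4')
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

# Reuse the trained layers, feeding tensors (not layer objects) into the LSTM
decoder_embedding = model.get_layer(name='dec_embedding')(decoder_inputs)
decoder_lstm = model.get_layer(name='dec_lstm')
decoder_outputs, state_h, state_c = decoder_lstm(decoder_embedding, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = model.get_layer(name='dec_outputs')(decoder_outputs)

decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)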
Update 2:
At training time, when creating the decoder_lstm layer you need to set return_state=True:
decoder_lstm, _, _ = LSTM(latent_dim, return_sequences=True, return_state=True, name='dec_lstm')(decoder_embedding, initial_state=encoder_states)
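With return_state=True the call now returns three tensors during training as well; the two state tensors are simply discarded and only the sequence output is fed to the softmax layer, for example:
from keras.layers import LSTM, Dense
from keras.models import Model

# Continuing from the corrected encoder/decoder definitions above
decoder_lstm_outputs, _, _ = LSTM(latent_dim, return_sequences=True, return_state=True,
                                  name='dec_lstm')(decoder_embedding, initial_state=encoder_states)
decoder_outputs = Dense(target_vocab, activation='softmax', name='dec_outputs')(decoder_lstm_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)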