Keras TimeDistributed for multi-input case?
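Description of our model
In our model, I would like to time-distribute low_level_model under a higher-level LSTM layer to build a hierarchical model. low_level_model computes a hidden representation of a customer visit by aggregating the results of an area sequence and its visit_id. Each area sequence passes through a CNN and an attention layer, and the result is concatenated with the embedding vector of the visit.
As far as I know, the TimeDistributed wrapper can be used to build a hierarchical model, so I tried to wrap our low_level_model, which takes two different inputs. However, the library does not seem to support the multi-input case. Here is our code.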
# Get 1st input
visit_input = keras.Input((1,))
visit_emb = visit_embedding_layer(visit_input)
visit_output = Reshape((-1,))(visit_emb)
# Get 2nd input - Shallow model
areas_input = keras.Input((10,))
areas_emb = area_embedding_layer(areas_input)
areas_cnn = Conv1D(filters=200, kernel_size=5,
                   padding='same', activation='relu', strides=1)(areas_emb)
areas_output = simple_attention(areas_cnn, areas_cnn)
# Concat two results from 1st and 2nd input
v_a_emb_concat = Concatenate()([visit_output, areas_output])
# Define this model as low_level_model
low_level_model = keras.Model(inputs=[areas_input, visit_input], outputs=v_a_emb_concat)
# Would like to use the result of this low_level_model as inputs for higher-level LSTM layer.
# Therefore, wrap this model by TimeDistributed layer
encoder = TimeDistributed(low_level_model)
# New input with step-size 5 (Consider 5 as the number of previous data)
all_visit_input = keras.Input((5, 1))
all_areas_input = keras.Input((5, 10))
# This part raises AssertionError (assert len(input_shape) >= 3)
all_areas_rslt = encoder(inputs=[all_visit_input, all_areas_input])
all_areas_lstm = LSTM(64, return_sequences=False)(all_areas_rslt)
logits = Dense(365, activation='softmax')(all_areas_lstm)
# Model define (Multi-input ISSUE HERE!)
self.model = keras.Model(inputs=[all_visit_input, all_areas_input], outputs=logits)
self.model.compile(optimizer=keras.optimizers.Adam(0.001),
                   loss=custom_loss_function)
# Get data
self.train_data = data.train_data_generator_hist()
self.test_data = data.test_data_generator_hist()
# Fit
self.history = self.model.fit_generator(
    generator=self.train_data,
    steps_per_epoch=train_data_size//FLAGS.batch_size,
    epochs=FLAGS.train_epochs
)
Error message
The error message is as follows.
File "/home/dmlab/sundong/revisit/survival-revisit-code/survrev.py", line 163, in train_test
all_areas_rslt = encoder(inputs=[all_visit_input, all_areas_input])
File "/home/dmlab/ksedm1/anaconda3/envs/py36/lib/python3.6/site-packages/keras/engine/base_layer.py", line 431, in __call__
self.build(unpack_singleton(input_shapes))
File "/home/dmlab/ksedm1/anaconda3/envs/py36/lib/python3.6/site-packages/keras/layers/wrappers.py", line 195, in build
assert len(input_shape) >= 3
AssertionError
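One plausible reading of this traceback, assuming that TimeDistributed.build receives the input shapes exactly as they were passed: with a list of two inputs, input_shape is a list of two shape tuples, so len(input_shape) is 2 and the check for at least 3 dimensions fails. The sketch below reproduces only that length check in plain Python. The shape values come from the question, and the helper name passes_rank_check is hypothetical.

```python
def passes_rank_check(input_shape):
    """Mimic the `len(input_shape) >= 3` check shown in the traceback."""
    return len(input_shape) >= 3


# Single input: one shape tuple (batch, timesteps, features) has length 3.
single_shape = (None, 5, 10)

# Two inputs: a list holding two shape tuples has length 2.
multi_shape = [(None, 5, 1), (None, 5, 10)]

print(passes_rank_check(single_shape))  # True
print(passes_rank_check(multi_shape))   # False
```

If this reading is right, the assertion is counting the number of inputs rather than the rank of either tensor, which would explain why the single-input variant passes.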
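What I have tried
1) I read this Keras issue, but could not clearly figure out how to use the trick it describes to forward multiple inputs.
2) I checked that the code with TimeDistributed works when I use only a single input (e.g., areas_input). The modified code example is shown below.
3) I am now trying to follow up on [the previous question].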
# Using only one input
areas_input = keras.Input((10,))
areas_emb = area_embedding_layer(areas_input)
areas_cnn = Conv1D(filters=200, kernel_size=5,
                   padding='same', activation='relu', strides=1)(areas_emb)
areas_output = simple_attention(areas_cnn, areas_cnn)
# Define this model as low_level_model
low_level_model = keras.Model(inputs=areas_input, outputs=areas_output)
# Would like to use the result of this low_level_model as inputs for higher-level LSTM layer.
# Therefore, wrap this model by TimeDistributed layer
encoder = TimeDistributed(low_level_model)
# New input with step-size 5 (Consider 5 as the number of previous data)
all_areas_input = keras.Input((5, 10))
# No Error
all_areas_rslt = encoder(inputs=all_areas_input)
all_areas_lstm = LSTM(64, return_sequences=False)(all_areas_rslt)
logits = Dense(365, activation='softmax')(all_areas_lstm)
# Define the model (single input, so the multi-input issue does not occur)
self.model = keras.Model(inputs=all_areas_input, outputs=logits)
self.model.compile(optimizer=keras.optimizers.Adam(0.001),
                   loss=custom_loss_function)
# Get data
self.train_data = data.train_data_generator_hist()
self.test_data = data.test_data_generator_hist()
# Fit
self.history = self.model.fit_generator(
    generator=self.train_data,
    steps_per_epoch=train_data_size//FLAGS.batch_size,
    epochs=FLAGS.train_epochs
)
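Thanks in advance for sharing any technique to solve this problem.
In summary, I solved this by taking all the inputs as a single tensor and splitting them with Lambda layers. The reason is that TimeDistributed can only accept a single input. Here is my code snippet.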
# Single packed input: 1 visit id followed by 10 area ids
single_input = keras.Input((1 + 10,))
visit_input = Lambda(lambda x: x[:, 0:1])(single_input)
areas_input = Lambda(lambda x: x[:, 1:])(single_input)
...
low_level_model = keras.Model(inputs=single_input, outputs=concat)
encoder = TimeDistributed(low_level_model)
# 5 timesteps, each carrying the packed 11-value input
multiple_inputs = keras.Input((5, 11))
all_areas_rslt = encoder(inputs=multiple_inputs)
all_areas_lstm = LSTM(64, return_sequences=False)(all_areas_rslt)
logits = Dense(365, activation='softmax')(all_areas_lstm)
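The snippet above expects the data to arrive already packed into one array of shape (batch, 5, 11). A minimal NumPy sketch of that packing step, and of the slices the two Lambda layers would see per timestep, might look like the following. The array names, id ranges, and batch size are made up for illustration, and the reshape to (batch * timesteps, 11) is an assumption about how the wrapped model is applied to each timestep.

```python
import numpy as np

batch_size, timesteps, n_areas = 4, 5, 10  # illustrative sizes

# Hypothetical integer ids, shaped like the two original inputs.
visit_ids = np.random.randint(0, 100, size=(batch_size, timesteps, 1))
area_ids = np.random.randint(0, 50, size=(batch_size, timesteps, n_areas))

# Pack along the last axis: visit id first, then the 10 area ids.
packed = np.concatenate([visit_ids, area_ids], axis=-1)  # (4, 5, 11)

# View of the data one timestep at a time: (batch * timesteps, 11).
per_step = packed.reshape(-1, 1 + n_areas)

# The same slices the two Lambda layers take.
visit_part = per_step[:, 0:1]  # (20, 1)
areas_part = per_step[:, 1:]   # (20, 10)

print(packed.shape, visit_part.shape, areas_part.shape)
```

Because both parts share one array, they must share one dtype. That is fine when both are integer ids destined for embedding layers, but mixing ids with float features would need extra care.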
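I got the same error message and traced it back to the same problem and GitHub issue as the original poster. By using keras.layers.RepeatVector, I was able to work around the multiple-input problem with a TimeDistributed output layer. Here is my example: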
core_input_1 = Input(shape=(self.core_timesteps, self.core_input_1_dim), name='core_input_1')
core_branch_1 = BatchNormalization(momentum=0.0, name='core_1_bn')(core_input_1)
core_branch_1 = LSTM(self.core_nodes[0], activation='relu', name='core_1_lstm_1', return_sequences=True)(core_branch_1)
core_branch_1 = LSTM(self.core_nodes[1], activation='relu', name='core_1_lstm_2')(core_branch_1)
core_input_2 = Input(shape=(self.core_timesteps, self.core_input_2_dim), name='core_input_2')
core_branch_2 = BatchNormalization(momentum=0.0, name='core_2_bn')(core_input_2)
core_branch_2 = LSTM(self.core_nodes[0], activation='relu', name='core_2_lstm_1', return_sequences=True)(core_branch_2)
core_branch_2 = LSTM(self.core_nodes[1], activation='relu', name='core_2_lstm_2')(core_branch_2)
merged = Concatenate()([core_branch_1, core_branch_2])
full_branch = RepeatVector(self.output_timesteps)(merged)
full_branch = LSTM(self.core_nodes[1], activation='relu', name='final_lstm', return_sequences=True)(full_branch)
full_branch = TimeDistributed(Dense(self.output_dim, name='td_dense', activation='relu'))(full_branch)
full_branch = TimeDistributed(Dense(self.output_dim, name='td_dense'))(full_branch)
model = Model(inputs=[core_input_1, core_input_2], outputs=full_branch, name='full_model')
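I posted the full example so that others can see an approach that works, but the key parts of the solution are:
- Use return_sequences = False in the layers just before the Concatenate.
- If an LSTM layer is used after the concatenation, set return_sequences = True so that it can feed the TimeDistributed layer.
- The RepeatVector layer argument must equal the number of timesteps in the TimeDistributed layer output.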
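To see why the RepeatVector argument determines the number of output timesteps, here is a small NumPy sketch of the shape change involved: a 2D array of shape (batch, features) becomes a 3D array of shape (batch, n, features) by repeating each vector n times. This is not Keras code, and the sizes and variable names are made up for illustration.

```python
import numpy as np

batch_size, features, output_timesteps = 2, 3, 4  # illustrative sizes

# Stand-in for the concatenated branch outputs: one vector per sample.
merged = np.arange(batch_size * features, dtype=np.float32).reshape(
    batch_size, features
)

# Repeat each sample's vector along a new time axis.
repeated = np.repeat(merged[:, np.newaxis, :], output_timesteps, axis=1)

print(merged.shape)    # (2, 3)
print(repeated.shape)  # (2, 4, 3)
```

Every timestep of the repeated array holds the same merged vector, so a following LSTM with return_sequences=True would produce one output per repeated step.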
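I have only tested this solution in the architecture relevant to my problem, so I am not sure of its limitations in other use cases of the TimeDistributed layer. Still, it is a neat solution that I could not find in any of the posts discussing this problem.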