Convert a unidirectional LSTM cell to a bidirectional LSTM cell in TensorFlow 1.0

I have legacy code implemented in TensorFlow 1.0.1, and I want to convert its current LSTM cell to a bidirectional LSTM.

with tf.variable_scope("encoder_scope") as encoder_scope:

    cell = contrib_rnn.LSTMCell(num_units=state_size, state_is_tuple=True)
    cell = DtypeDropoutWrapper(cell=cell, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
    cell = contrib_rnn.MultiRNNCell(cells=[cell] * num_lstm_layers, state_is_tuple=True)

    encoder_cell = cell

    encoder_outputs, last_encoder_state = tf.nn.dynamic_rnn(
        cell=encoder_cell,
        dtype=DTYPE,
        sequence_length=encoder_sequence_lengths,
        inputs=encoder_inputs,
        )

I found an example here: https://riptutorial.com/tensorflow/example/17004/creating-a-bidirectional-lstm

But following that reference, I was not able to convert my LSTM cell to a bidirectional one. What should go in place of state_below in my case?

Update: In addition to the above, I also need to figure out how to convert the following decoder network (dynamic_rnn_decoder) to use a bidirectional LSTM. (The documentation gives no clue.)

with tf.variable_scope("decoder_scope") as decoder_scope:

    decoder_cell = tf.contrib.rnn.LSTMCell(num_units=state_size)
    decoder_cell = DtypeDropoutWrapper(cell=decoder_cell, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
    decoder_cell = contrib_rnn.MultiRNNCell(cells=[decoder_cell] * num_lstm_layers, state_is_tuple=True)   

    # define decoder train network
    decoder_outputs_tr, _ , _ = dynamic_rnn_decoder(
        cell=decoder_cell, # the cell function
        decoder_fn= simple_decoder_fn_train(last_encoder_state, name=None),
        inputs=decoder_inputs,
        sequence_length=decoder_sequence_lengths,
        parallel_iterations=None,
        swap_memory=False,
        time_major=False)

Can anyone explain this?

You can use bidirectional_dynamic_rnn [1]:

cell_fw = contrib_rnn.LSTMCell(num_units=state_size, state_is_tuple=True)
cell_fw = DtypeDropoutWrapper(cell=cell_fw, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
cell_fw = contrib_rnn.MultiRNNCell(cells=[cell_fw] * int(num_lstm_layers / 2), state_is_tuple=True)

cell_bw = contrib_rnn.LSTMCell(num_units=state_size, state_is_tuple=True)
cell_bw = DtypeDropoutWrapper(cell=cell_bw, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
cell_bw = contrib_rnn.MultiRNNCell(cells=[cell_bw] * int(num_lstm_layers / 2), state_is_tuple=True)

encoder_cell_fw = cell_fw
encoder_cell_bw = cell_bw

encoder_outputs, (output_state_fw, output_state_bw) = tf.nn.bidirectional_dynamic_rnn(
    cell_fw=encoder_cell_fw,
    cell_bw=encoder_cell_bw,
    dtype=DTYPE,
    sequence_length=encoder_sequence_lengths,
    inputs=encoder_inputs,
    )

last_encoder_state = [
                       tf.concat([output_state_fw[0], output_state_bw[0]], axis=-1),
                       tf.concat([output_state_fw[1], output_state_bw[1]], axis=-1)
                     ]
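As a shape check, here is a minimal NumPy sketch (with made-up batch_size and state_size values) of what that concatenation does: each direction contributes a state of width state_size, so the merged state is twice as wide. This is exactly why the decoder needs its num_units doubled.

```python
import numpy as np

batch_size, state_size = 2, 8  # hypothetical values

# Final hidden states of the forward and backward passes for one layer.
h_fw = np.zeros((batch_size, state_size))
h_bw = np.ones((batch_size, state_size))

# Concatenating along the last axis merges the two directions.
h_merged = np.concatenate([h_fw, h_bw], axis=-1)
print(h_merged.shape)  # (2, 16): the state is now twice as wide
```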

However, as the TensorFlow documentation says, this API is deprecated, and you should consider migrating to TensorFlow 2 and using keras.layers.Bidirectional(keras.layers.RNN(cell)).
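For reference, a minimal TF2 sketch of that replacement (the sizes are made up, and it wraps keras.layers.LSTM directly, which is the common shortcut for keras.layers.RNN(cell) with an LSTM cell):

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes: batch of 2, sequences of length 5, 3 features, 4 units.
inputs = np.zeros((2, 5, 3), dtype=np.float32)

# The Bidirectional wrapper runs the LSTM forward and backward over the
# sequence and, by default, concatenates the two outputs on the feature axis.
bilstm = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(4, return_sequences=True))

outputs = bilstm(inputs)
print(outputs.shape)  # (2, 5, 8): feature size doubled by the concatenation
```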

Regarding the updated question: you cannot use a bidirectional cell in the decoder, because a bidirectional pass would require the decoder to already know what it still has to generate [2].

In any case, to adapt the decoder to a bidirectional encoder, you can concatenate the encoder states and double the decoder's num_units (or halve num_units in the encoder) [3]:

decoder_cell = tf.contrib.rnn.LSTMCell(num_units=state_size)
decoder_cell = DtypeDropoutWrapper(cell=decoder_cell, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
decoder_cell = contrib_rnn.MultiRNNCell(cells=[decoder_cell] * num_lstm_layers, state_is_tuple=True)   

# define decoder train network
decoder_outputs_tr, _ , _ = dynamic_rnn_decoder(
    cell=decoder_cell, # the cell function
    decoder_fn= simple_decoder_fn_train(last_encoder_state, name=None),
    inputs=decoder_inputs,
    sequence_length=decoder_sequence_lengths,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False)