Sum of all outputs of stacked LSTM cells - Tensorflow
Here is a standard implementation of several stacked LSTM cells in TensorFlow:
with tf.name_scope("RNN_layers"):
    def lstm_cell():
        lstm = tf.contrib.rnn.LayerNormBasicLSTMCell(lstm_size)
        return lstm
    cell = tf.contrib.rnn.MultiRNNCell([lstm_cell() for _ in range(num_layers)])

with tf.name_scope("RNN_init_state"):
    initial_state = cell.zero_state(batch_size, tf.float32)

with tf.name_scope("RNN_forward"):
    outputs, state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)
This works well for a large number of tasks. For others, however, some experts suggest taking as the final output the sum of the outputs of all cells in the stack, along the num_layers direction, rather than only the output of the last cell. For a stack of three layers, the requirement is y_t = h_t^1 + h_t^2 + h_t^3.
What is the most sensible way to implement this in TensorFlow?
The outputs tensor that you get from tf.nn.dynamic_rnn is the list of the outputs of all cells. If you want to compute their sum, just call tf.reduce_sum on outputs:
n_steps = 2
n_inputs = 3
n_neurons = 5

X = tf.placeholder(dtype=tf.float32, shape=[None, n_steps, n_inputs])
basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32)

# outputs has shape [?, n_steps, n_neurons], i.e. the outputs of all cells (time steps)
outputs_sum = tf.reduce_sum(outputs, axis=1)  # renamed to avoid shadowing the built-in sum
# outputs_sum has shape [?, n_neurons]
In the case of MultiRNNCell, this will be the sum of the outputs of the last layer, which is also usually what you want.
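For illustration, here is a minimal sketch (reusing the X, n_steps, and n_neurons defined above, with a hypothetical num_layers) showing that with a MultiRNNCell the outputs tensor still exposes only the top layer, so the same reduce_sum yields the last layer's outputs summed over time:

num_layers = 3
layers = [tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons) for _ in range(num_layers)]
multi_cell = tf.nn.rnn_cell.MultiRNNCell(layers)
outputs, states = tf.nn.dynamic_rnn(multi_cell, X, dtype=tf.float32)

# outputs is still [?, n_steps, n_neurons]: only the top layer is exposed,
# so this sums the last layer's outputs over the time axis, not across layers
top_layer_sum = tf.reduce_sum(outputs, axis=1)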
Update:
Summing the tensors in the hidden layers is more difficult, because the TensorFlow MultiRNNCell reuses the same tensor for the output of each cell, so the hidden layers are never exposed outside the RNN.
The simplest solution is to write your own MultiRNNCell that sums up the outputs of every layer instead of remembering only the last one. Here is how:
from tensorflow.python.util import nest

class MyMultiRNNCell(tf.nn.rnn_cell.MultiRNNCell):
    def call(self, inputs, state):
        cur_state_pos = 0
        cur_inp = inputs
        new_states = []
        new_outputs = []
        for i, cell in enumerate(self._cells):
            with tf.variable_scope("cell_%d" % i):
                if self._state_is_tuple:
                    if not nest.is_sequence(state):
                        raise ValueError(
                            "Expected state to be a tuple of length %d, but received: %s"
                            % (len(self.state_size), state))
                    cur_state = state[i]
                else:
                    cur_state = tf.slice(state, [0, cur_state_pos], [-1, cell.state_size])
                    cur_state_pos += cell.state_size
                cur_inp, new_state = cell(cur_inp, cur_state)
                new_states.append(new_state)
                # remember every layer's output, not just the last one
                new_outputs.append(cur_inp)

        new_states = (tuple(new_states) if self._state_is_tuple else
                      tf.concat(new_states, 1))
        # sum across the layer dimension: y_t = h_t^1 + h_t^2 + ...
        new_outputs_sum = tf.reduce_sum(new_outputs, axis=0)
        return new_outputs_sum, new_states
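A minimal usage sketch, assuming the lstm_cell, num_layers, batch_size, and inputs from the question; MyMultiRNNCell drops in where MultiRNNCell was used:

with tf.name_scope("RNN_layers"):
    cell = MyMultiRNNCell([lstm_cell() for _ in range(num_layers)],
                          state_is_tuple=True)

with tf.name_scope("RNN_init_state"):
    initial_state = cell.zero_state(batch_size, tf.float32)

with tf.name_scope("RNN_forward"):
    # outputs[:, t, :] is now h_t^1 + h_t^2 + ... + h_t^num_layers
    outputs, state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)

Note that this only works if all layers have the same output size (here, lstm_size), since the per-layer outputs are summed elementwise.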