Get the last output of a dynamic_rnn in TensorFlow
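I have a 3-D tensor of shape [batch, None, dim], where the second dimension (the number of time steps) is unknown. I use dynamic_rnn to process this kind of input, as in the following snippet: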
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
Running this snippet with some actual numbers gives reasonable results:
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                      [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                     dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_ = sess.run(output, {inputs: inputs_, lengths: lengths_})
    print(output_)
The output is:
[[[ 0. 0. 0. 0. ]
[ 0.02188676 -0.01294564 0.05340237 -0.47148666]
[ 0.0343586 -0.02243731 0.0870839 -0.89869428]
[ 0. 0. 0. 0. ]]
[[ 0.00284752 -0.00315077 0.00108094 -0.99883419]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
Is there a way to get a 3-D tensor of shape [batch, 1, hidden] that holds the last relevant output of the dynamic RNN for each sequence? Thanks!
Actually, the solution was not that hard. I implemented the following code:
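slices = []
for index, l in enumerate(tf.unstack(lengths)):
    # Take the output at the last valid time step (l - 1) of sequence `index`.
    # Note: size=[1, 1, 3] keeps only the first 3 of the `hidden` = 4 units;
    # use size=[1, 1, hidden] to keep the whole vector.
    slice = tf.slice(output, begin=[index, l - 1, 0], size=[1, 1, 3])
    slices.append(slice)
# TensorFlow >= 1.0 argument order; before 1.0 this was tf.concat(0, slices).
last = tf.concat(slices, axis=0)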
So the complete snippet is as follows:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                      [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                     dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
slices = []
for index, l in enumerate(tf.unstack(lengths)):
    # size=[1, 1, 3] keeps only the first 3 of the 4 hidden units, which is
    # why the result below has 3 columns; use `hidden` for the full vector.
    slice = tf.slice(output, begin=[index, l - 1, 0], size=[1, 1, 3])
    slices.append(slice)
# TensorFlow >= 1.0 argument order; before 1.0 this was tf.concat(0, slices).
last = tf.concat(slices, axis=0)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs = sess.run([output, last], {inputs: inputs_, lengths: lengths_})
    print('RNN output:')
    print(outputs[0])
    print()
    print('last relevant output:')
    print(outputs[1])
And the output:
RNN output:
[[[ 0. 0. 0. 0. ]
[-0.06667092 -0.09284072 0.01098599 -0.03676109]
[-0.09101103 -0.19828682 0.03546784 -0.08721405]
[ 0. 0. 0. 0. ]]
[[-0.00025157 -0.05704876 0.05527233 -0.03741353]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
last relevant output:
[[[-0.09101103 -0.19828682 0.03546784]]
[[-0.00025157 -0.05704876 0.05527233]]]
This is what gather_nd is for!
def extract_axis_1(data, ind):
    """
    Get specified elements along axis 1 of a tensor.
    :param data: Tensorflow tensor that will be subsetted.
    :param ind: Indices to take (one for each element along axis 0 of data).
    :return: Subsetted tensor.
    """
    batch_range = tf.range(tf.shape(data)[0])
    indices = tf.stack([batch_range, ind], axis=1)
    res = tf.gather_nd(data, indices)
    return res
In your case:
output = extract_axis_1(output, lengths - 1)
Now output is a tensor of shape [batch_size, num_cells].
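For anyone who wants to check the indexing without building a graph, here is a rough NumPy sketch of the same selection. It assumes NumPy integer-array indexing as a stand-in for tf.gather_nd, and the helper name last_relevant is made up for this example. The array is the RNN output printed in the previous answer.

```python
import numpy as np

def last_relevant(data, lengths):
    """Pick data[b, lengths[b] - 1, :] for every batch entry b."""
    batch_range = np.arange(data.shape[0])
    # Pairing batch_range with (lengths - 1) selects one time step per sequence,
    # the same (batch, time) index pairs that extract_axis_1 stacks for gather_nd.
    return data[batch_range, lengths - 1]

# RNN output copied from the previous answer: [batch=2, time=4, hidden=4].
rnn_output = np.array([
    [[0.0, 0.0, 0.0, 0.0],
     [-0.06667092, -0.09284072, 0.01098599, -0.03676109],
     [-0.09101103, -0.19828682, 0.03546784, -0.08721405],
     [0.0, 0.0, 0.0, 0.0]],
    [[-0.00025157, -0.05704876, 0.05527233, -0.03741353],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]],
])
lengths = np.array([3, 1])

last = last_relevant(rnn_output, lengths)   # shape [batch, hidden]
last_3d = last[:, np.newaxis, :]            # shape [batch, 1, hidden]
print(last)
print(last_3d.shape)
```

The extra np.newaxis step is only there because the question asked for [batch, 1, hidden]; in TensorFlow the analogous step would be tf.expand_dims on the gathered result.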
According to the following two sources,
http://www.wildml.com/2016/08/rnns-in-tensorflow-a-practical-guide-and-undocumented-features/
outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)
or https://github.com/ageron/handson-ml/blob/master/14_recurrent_neural_networks.ipynb,
apparently last_states can be taken directly from the second output of the dynamic_rnn call. It gives you the last states across all layers (for an LSTM it is made up of LSTMStateTuple objects), whereas outputs contains all the states of the last layer.
OK, so it looks like there actually is an easier solution. As @Shao Tang and @Rahul mentioned, the preferred way to do this is to access the final cell state. Here is why:
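- If you look at the GRUCell source code (below), you will see that the "state" the cell maintains is the hidden output itself: call returns new_h as both the output and the new state. So when tf.nn.dynamic_rnn returns the final state, it is returning the final hidden output you are interested in. To prove this, I just tweaked your setup and got the results: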
GRUCell call (rnn_cell_impl.py):
def call(self, inputs, state):
    """Gated recurrent unit (GRU) with nunits cells."""
    if self._gate_linear is None:
        bias_ones = self._bias_initializer
        if self._bias_initializer is None:
            bias_ones = init_ops.constant_initializer(1.0, dtype=inputs.dtype)
        with vs.variable_scope("gates"):  # Reset gate and update gate.
            self._gate_linear = _Linear(
                [inputs, state],
                2 * self._num_units,
                True,
                bias_initializer=bias_ones,
                kernel_initializer=self._kernel_initializer)
    value = math_ops.sigmoid(self._gate_linear([inputs, state]))
    r, u = array_ops.split(value=value, num_or_size_splits=2, axis=1)
    r_state = r * state
    if self._candidate_linear is None:
        with vs.variable_scope("candidate"):
            self._candidate_linear = _Linear(
                [inputs, r_state],
                self._num_units,
                True,
                bias_initializer=self._bias_initializer,
                kernel_initializer=self._kernel_initializer)
    c = self._activation(self._candidate_linear([inputs, r_state]))
    new_h = u * state + (1 - u) * c
    return new_h, new_h
Solution:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, state = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                      [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                     dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_, state_ = sess.run([output, state], {inputs: inputs_, lengths: lengths_})
    print(output_)
    print(state_)
Output:
[[[ 0. 0. 0. 0. ]
[-0.24305521 -0.15512943 0.06614969 0.16873555]
[-0.62767833 -0.30741733 0.14819752 0.44313088]
[ 0. 0. 0. 0. ]]
[[-0.99152333 -0.1006391 0.28767768 0.76360202]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
[[-0.62767833 -0.30741733 0.14819752 0.44313088]
[-0.99152333 -0.1006391 0.28767768 0.76360202]]
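To see why the final state lines up with the last non-zero row of each sequence in the output above, here is a small NumPy sketch of the behaviour. It is not TensorFlow's implementation: the GRU step is simplified (no biases, random weights), and the function names gru_step and run_rnn are invented for this example. It only mimics the documented sequence_length behaviour, where outputs past the end of a sequence are zeroed and the state is copied through.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W_gates, W_cand):
    """One simplified GRU step (biases omitted): new_h = u * h + (1 - u) * c."""
    hidden = h.shape[1]
    gates = sigmoid(np.concatenate([x, h], axis=1) @ W_gates)
    r, u = gates[:, :hidden], gates[:, hidden:]       # reset and update gates
    c = np.tanh(np.concatenate([x, r * h], axis=1) @ W_cand)
    return u * h + (1.0 - u) * c

def run_rnn(inputs, lengths, W_gates, W_cand, hidden):
    """Unroll over time, masking steps at or beyond each sequence length."""
    batch, steps, _ = inputs.shape
    h = np.zeros((batch, hidden))
    outputs = np.zeros((batch, steps, hidden))
    for t in range(steps):
        new_h = gru_step(inputs[:, t], h, W_gates, W_cand)
        active = (t < lengths)[:, np.newaxis]          # [batch, 1] mask
        outputs[:, t] = np.where(active, new_h, 0.0)   # zeros past the end
        h = np.where(active, new_h, h)                 # state carried through
    return outputs, h

rng = np.random.default_rng(0)
batch, dim, hidden = 2, 3, 4
inputs = np.array([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                   [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]], dtype=np.float64)
lengths = np.array([3, 1])
W_gates = rng.normal(scale=0.1, size=(dim + hidden, 2 * hidden))
W_cand = rng.normal(scale=0.1, size=(dim + hidden, hidden))

outputs, final_state = run_rnn(inputs, lengths, W_gates, W_cand, hidden)
picked = outputs[np.arange(batch), lengths - 1]
print(np.allclose(final_state, picked))
```

Because the state stops changing once a sequence ends, the final state is the output at index length - 1, so no gathering or slicing is needed for a single-layer GRU.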
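For other readers who are working with LSTMCell (another popular option), things work a little differently. LSTMCell maintains its state differently: the state is either a tuple or a concatenated version of the actual cell state and the hidden state. So, to access the final hidden state, you can set state_is_tuple to True when you create the cell, and the final state will be a tuple: (final cell state, final hidden state). In that case,
_, (_, h) = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
will give you the final hidden state.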
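As an illustration of that (c, h) layout, here is a hedged NumPy sketch of an LSTM unrolled by hand. The step function is a simplified stand-in (no biases or peepholes, random weights, gate order chosen arbitrarily), not the LSTMCell code, and lstm_step is a name made up for this example. It only shows that the h half of the state tuple is the same array as the last output, while c is kept separately.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, c, h, W):
    """One simplified LSTM step; returns (output, (new_c, new_h))."""
    z = np.concatenate([x, h], axis=1) @ W
    i, g, f, o = np.split(z, 4, axis=1)        # input, candidate, forget, output
    new_c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    new_h = sigmoid(o) * np.tanh(new_c)
    return new_h, (new_c, new_h)               # the output is the h part of the state

rng = np.random.default_rng(0)
batch, steps, dim, hidden = 2, 4, 3, 4
inputs = rng.normal(size=(batch, steps, dim))
W = rng.normal(scale=0.1, size=(dim + hidden, 4 * hidden))

c = np.zeros((batch, hidden))
h = np.zeros((batch, hidden))
step_outputs = []
for t in range(steps):
    out, (c, h) = lstm_step(inputs[:, t], c, h, W)
    step_outputs.append(out)
outputs = np.stack(step_outputs, axis=1)       # [batch, steps, hidden]

final_c, final_h = c, h                        # same unpacking as _, (_, h) above
print(np.allclose(final_h, outputs[:, -1]))
```

All sequences here run for the full number of steps, so the last output is simply outputs[:, -1]; with variable lengths the masking shown for the GRU case would apply in the same way.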
References:
https://github.com/tensorflow/tensorflow/blob/438604fc885208ee05f9eef2d0f2c630e1360a83/tensorflow/python/ops/rnn_cell_impl.py#L308
https://github.com/tensorflow/tensorflow/blob/438604fc885208ee05f9eef2d0f2c630e1360a83/tensorflow/python/ops/rnn_cell_impl.py#L415