TensorBoard - visualize weights of LSTM

I am using several LSTM layers to form a deep recurrent neural network. I would like to monitor the weights of each LSTM layer during training, but I cannot find out how to attach summaries of the LSTM layer weights for TensorBoard.

Any suggestions on how to do this?

Here is the code:

cells = []

with tf.name_scope("cell_1"):
    cell1 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    cell1 = tf.contrib.rnn.DropoutWrapper(cell1,
                input_keep_prob=self.input_dropout,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell1)

with tf.name_scope("cell_2"):
    cell2 = tf.contrib.rnn.LSTMCell(self.n_hidden, state_is_tuple=True, initializer=self.initializer)
    cell2 = tf.contrib.rnn.DropoutWrapper(cell2,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell2)

with tf.name_scope("cell_3"):
    cell3 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    # cell has no input dropout since previous cell already has output dropout
    cell3 = tf.contrib.rnn.DropoutWrapper(cell3,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell3)

cell = tf.contrib.rnn.MultiRNNCell(
    cells, state_is_tuple=True)

output, self.final_state = tf.nn.dynamic_rnn(
    cell,
    inputs=self.inputs,
    initial_state=self.init_state)

tf.contrib.rnn.LSTMCell objects have a property called variables that works for this. There is just one trick: the property returns an empty list until your cell has gone through tf.nn.dynamic_rnn. At least that is the case when using a single LSTMCell; I can't speak for MultiRNNCell. So I would expect this to work:

output, self.final_state = tf.nn.dynamic_rnn(...)
for one_lstm_cell in cells:
    one_kernel, one_bias = one_lstm_cell.variables
    # I think TensorBoard handles summaries with the same name fine.
    tf.summary.histogram("Kernel", one_kernel)
    tf.summary.histogram("Bias", one_bias)

And then you probably know how to proceed from there, but for completeness:

summary_op = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(
        "my/preferred/logdir/train", graph=tf.get_default_graph())
    for step in range(1, training_steps+1):
        ...
        _, step_summary = sess.run([train_op, summary_op])
        train_writer.add_summary(step_summary, step)  # pass the step so TensorBoard has an x-axis

Looking at the TensorFlow docs linked above, there is also a weights property; I don't know what the difference is, if any. Also, the order in which variables are returned is not documented. I figured it out by printing the resulting list and looking at the variable names.
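
To see that order for yourself, a minimal sketch (assuming, as in the loop above, that the cells expose variables once the graph has been built; see the DropoutWrapper caveat below) is to print each variable's name and shape:

for one_lstm_cell in cells:
    for var in one_lstm_cell.variables:
        # Kernels show up as ".../lstm_cell/kernel:0", biases as ".../lstm_cell/bias:0".
        print(var.name, var.shape)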

Now, according to its doc, MultiRNNCell has the same variables property, and the doc says it returns all layer variables. I honestly don't know how MultiRNNCell works, so I cannot tell you whether these are variables belonging exclusively to MultiRNNCell or whether they include the variables of the cells that go into it. Either way, knowing the property exists should be a nice tip! Hope this helps.
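
If it does include the wrapped cells' variables, a hedged sketch (untested) of summarizing everything in one pass, reusing the MultiRNNCell cell from the question, might be:

# After tf.nn.dynamic_rnn has built the graph, the property should be populated.
for var in cell.variables:
    # Summary names may not contain ':', so replace it before logging.
    tf.summary.histogram(var.name.replace(":", "_"), var)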


While variables is documented for most (all?) RNN classes, it does break for DropoutWrapper. The property has been documented since r1.2, but accessing it causes an exception in 1.2 and 1.4 (and, it looks like, in 1.3, but that is untested). Specifically:

from tensorflow.contrib import rnn
...
lstm_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
wrapped_cell = rnn.DropoutWrapper(lstm_cell)
outputs, states = rnn.static_rnn(wrapped_cell, x, dtype=tf.float32)
print("LSTM vars!", lstm_cell.variables)
print("Wrapped vars!", wrapped_cell.variables)

will throw AttributeError: 'DropoutWrapper' object has no attribute 'trainable'. From the traceback (or a long stare at the DropoutWrapper source), I noticed that variables is implemented in DropoutWrapper's super RNNCell's super Layer. Dizzy yet? Indeed, we find the documented variables property there. It returns the (documented) weights property. The weights property returns the (documented) self.trainable_weights + self.non_trainable_weights properties. And finally, the root of the problem:

@property
def trainable_weights(self):
    return self._trainable_weights if self.trainable else []

@property
def non_trainable_weights(self):
    if self.trainable:
        return self._non_trainable_weights
    else:
        return self._trainable_weights + self._non_trainable_weights

That is, variables does not work for a DropoutWrapper instance. Neither do trainable_weights or non_trainable_weights, since self.trainable is not defined.

Going further, Layer.__init__ defaults self.trainable to True, but DropoutWrapper never calls it. To quote a TensorFlow contributor on Github:

DropoutWrapper does not have variables because it does not itself store any. It wraps a cell that may have variables; but it's not clear what the semantics should be if you access the DropoutWrapper.variables. For example, all keras layers only report back the variables that they own; and so only one layer ever owns any variable. That said, this should probably return [], and the reason it doesn't is that DropoutWrapper never calls super().__init__ in its constructor. That should be an easy fix; PRs welcome.

So, for example, to access the LSTM variables in the example above, lstm_cell.variables suffices.
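
A small sketch of that workaround, continuing the snippet above and assuming variables yields the kernel and bias in that order (which, as noted earlier, is not documented):

# The inner cell's variables are populated once static_rnn has built it.
one_kernel, one_bias = lstm_cell.variables
# Summarize the wrapped LSTM's weights without touching wrapped_cell.variables.
tf.summary.histogram("lstm_kernel", one_kernel)
tf.summary.histogram("lstm_bias", one_bias)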


Edit: To the best of my knowledge, Mike Khan's PR has been incorporated into 1.5. Now, the variables property of the dropout layer returns an empty list.