TensorBoard - visualize weights of LSTM
I am using several LSTM layers to form a deep recurrent neural network. I would like to monitor the weights of each LSTM layer during training, but I cannot figure out how to attach summaries of the LSTM layer weights to TensorBoard.
Any suggestions on how to do this?
The code is as follows:
cells = []

with tf.name_scope("cell_1"):
    cell1 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    cell1 = tf.contrib.rnn.DropoutWrapper(cell1,
                                          input_keep_prob=self.input_dropout,
                                          output_keep_prob=self.output_dropout,
                                          state_keep_prob=self.recurrent_dropout)
    cells.append(cell1)

with tf.name_scope("cell_2"):
    cell2 = tf.contrib.rnn.LSTMCell(self.n_hidden, state_is_tuple=True, initializer=self.initializer)
    cell2 = tf.contrib.rnn.DropoutWrapper(cell2,
                                          output_keep_prob=self.output_dropout,
                                          state_keep_prob=self.recurrent_dropout)
    cells.append(cell2)

with tf.name_scope("cell_3"):
    cell3 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    # cell has no input dropout since previous cell already has output dropout
    cell3 = tf.contrib.rnn.DropoutWrapper(cell3,
                                          output_keep_prob=self.output_dropout,
                                          state_keep_prob=self.recurrent_dropout)
    cells.append(cell3)

cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)

output, self.final_state = tf.nn.dynamic_rnn(
    cell,
    inputs=self.inputs,
    initial_state=self.init_state)
tf.contrib.rnn.LSTMCell objects have a property called variables that does the trick. There is just one catch: the property returns an empty list until your cell has gone through tf.nn.dynamic_rnn. At least that is the case for a single LSTMCell; I cannot speak for MultiRNNCell. So I would expect this to work:
output, self.final_state = tf.nn.dynamic_rnn(...)
for one_lstm_cell in cells:
    one_kernel, one_bias = one_lstm_cell.variables
    # I think TensorBoard handles summaries with the same name fine.
    tf.summary.histogram("Kernel", one_kernel)
    tf.summary.histogram("Bias", one_bias)
And then you probably know your way from there, but:
summary_op = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(
        "my/preferred/logdir/train", graph=tf.get_default_graph())
    for step in range(1, training_steps+1):
        ...
        _, step_summary = sess.run([train_op, summary_op])
        # Pass the step so TensorBoard orders the histograms along the x-axis.
        train_writer.add_summary(step_summary, global_step=step)
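With the summaries written, you can inspect the histograms by launching TensorBoard on the parent log directory (path taken from the snippet above) and opening the Histograms tab:

tensorboard --logdir my/preferred/logdir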
Looking at the TensorFlow documentation I linked above, there is also a weights property. I do not know the difference, if there is one. Also, the order in which variables is returned is not documented; I figured it out by printing the resulting list and looking at the variable names.
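For example, a quick check along those lines, mirroring the loop above (a minimal sketch; the exact names depend on your variable scopes and TF version, but the kernel typically ends in .../kernel and the bias in .../bias):

# Run this after tf.nn.dynamic_rnn, otherwise the lists are still empty.
for one_lstm_cell in cells:
    for v in one_lstm_cell.variables:
        print(v.name, v.shape)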
Now, MultiRNNCell has the same variables property according to its doc, and it says it returns all layer variables. I honestly do not know how MultiRNNCell works, so I cannot tell you whether these are variables belonging exclusively to MultiRNNCell, or whether they include the variables from the cells that go into it. Either way, knowing the property exists should be a nice tip! Hope this helps.
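If in doubt, one way to find out is to print what the wrapper reports (a sketch, assuming the graph from the question has already been built through tf.nn.dynamic_rnn); the scope prefixes in the names should reveal whether the inner cells' kernels and biases are included:

for v in cell.variables:  # `cell` is the MultiRNNCell from the question
    print(v.name)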
While variables is documented for most (all?) RNN classes, it does break for DropoutWrapper. The property has been documented since r1.2, but accessing it raises an exception in 1.2 and 1.4 (and by the looks of it 1.3, but that is untested). Specifically,
from tensorflow.contrib import rnn
...
lstm_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
wrapped_cell = rnn.DropoutWrapper(lstm_cell)
outputs, states = rnn.static_rnn(wrapped_cell, x, dtype=tf.float32)
print("LSTM vars!", lstm_cell.variables)
print("Wrapped vars!", wrapped_cell.variables)
will throw AttributeError: 'DropoutWrapper' object has no attribute 'trainable'. From the traceback (or a long stare at the DropoutWrapper source), I noticed that variables is implemented in DropoutWrapper's super RNNCell's super Layer. Dizzy yet? Indeed, this is where we find the documented variables property. It returns the (documented) weights property, and the weights property in turn returns the (documented) self.trainable_weights + self.non_trainable_weights properties. And finally, the root of the problem:
@property
def trainable_weights(self):
    return self._trainable_weights if self.trainable else []

@property
def non_trainable_weights(self):
    if self.trainable:
        return self._non_trainable_weights
    else:
        return self._trainable_weights + self._non_trainable_weights
That is, variables does not work for a DropoutWrapper instance. Neither will trainable_weights or non_trainable_weights, since self.trainable is not defined.
Going one step deeper, Layer.__init__ defaults self.trainable to True, but DropoutWrapper never calls it. To quote a TensorFlow contributor on GitHub:
DropoutWrapper does not have variables because it does not itself store any. It wraps a cell that may have variables; but it's not clear what the semantics should be if you access the DropoutWrapper.variables. For example, all keras layers only report back the variables that they own; and so only one layer ever owns any variable. That said, this should probably return [], and the reason it doesn't is that DropoutWrapper never calls super().__init__ in its constructor. That should be an easy fix; PRs welcome.
So, for example, to access the LSTM variables in the example above, lstm_cell.variables suffices.
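Concretely, on the affected versions you can read the weights off the wrapped cell rather than the wrapper (a sketch continuing the static_rnn example above; as noted earlier, the kernel/bias order is undocumented, so verify it by printing the variable names):

# The inner BasicLSTMCell owns the kernel and bias, so summarize those
# instead of touching wrapped_cell.variables.
one_kernel, one_bias = lstm_cell.variables
tf.summary.histogram("Kernel", one_kernel)
tf.summary.histogram("Bias", one_bias)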
Edit: as far as I can tell, Mike Khan's PR made it into 1.5. Now the variables property of the dropout layer returns an empty list.