About names of variable scope in TensorFlow

Question

I have recently been trying to learn TensorFlow, but I do not understand how variable scopes work. In particular, I have the following problem:

import tensorflow as tf
from tensorflow.models.rnn import rnn_cell
from tensorflow.models.rnn import rnn

inputs = [tf.placeholder(tf.float32,shape=[10,10]) for _ in range(5)]
cell = rnn_cell.BasicLSTMCell(10)
outpts, states = rnn.rnn(cell, inputs, dtype=tf.float32)

print outpts[2].name
# ==> u'RNN/BasicLSTMCell_2/mul_2:0'

Where does the '_2' in 'BasicLSTMCell_2' come from? And how does it work when I later use tf.get_variable(reuse=True) to get the same variable again?

Edit: I think I have found a related problem:

def creating(s):
    with tf.variable_scope('test'):
        with tf.variable_scope('inner'):
            a=tf.get_variable(s,[1])
    return a

def creating_mod(s):
    with tf.variable_scope('test'):
        with tf.variable_scope('inner'):
            a=tf.Variable(0.0, name=s)
    return a

tf.ops.reset_default_graph()
a=creating('a')
b=creating_mod('b')
c=creating('c')
d=creating_mod('d')

print a.name, '\n', b.name,'\n', c.name,'\n', d.name

The output is

test/inner/a:0 
test_1/inner/b:0 
test/inner/c:0 
test_3/inner/d:0

I am quite confused...

Answer 1

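The "_2" in "BasicLSTMCell_2" relates to the name scope in which the op outpts[2] was created. Every time you create a new name scope (with tf.name_scope()) or variable scope (with tf.variable_scope()), a scope name based on the given string is added to the current name scope, possibly with an additional suffix to make it unique. The call to rnn.rnn(...) has the following pseudocode (simplified and using public API methods for clarity):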

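def rnn(cell, inputs):
  outputs = []
  with tf.variable_scope("RNN"):
    for timestep, input_t in enumerate(inputs):
      if timestep > 0:
        # After the first timestep, look up existing variables instead of creating new ones.
        tf.get_variable_scope().reuse_variables()
      # Each entry opens a new name scope: BasicLSTMCell, BasicLSTMCell_1, ...
      with tf.variable_scope("BasicLSTMCell"):
        outputs.append(...)
  return outputs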

If you look at the names of the tensors in outpts, you will see that they are as follows:

>>> print [o.name for o in outpts]
[u'RNN/BasicLSTMCell/mul_2:0',
 u'RNN/BasicLSTMCell_1/mul_2:0',
 u'RNN/BasicLSTMCell_2/mul_2:0',
 u'RNN/BasicLSTMCell_3/mul_2:0',
 u'RNN/BasicLSTMCell_4/mul_2:0']

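When you enter a new name scope (by entering a with tf.name_scope("..."): or with tf.variable_scope("..."): block), TensorFlow creates a new, unique name for the scope. The first time the "BasicLSTMCell" scope is entered, TensorFlow uses that name verbatim, because it has not been used before (in the "RNN/" scope). The next time, TensorFlow appends "_1" to the scope name to make it unique, and so on up to "RNN/BasicLSTMCell_4".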

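The main difference between variable scopes and name scopes is that a variable scope also has a set of name-to-tf.Variable bindings. By calling tf.get_variable_scope().reuse_variables(), we instruct TensorFlow to reuse variables rather than create them for the "RNN/" scope (and its children) after timestep 0. This ensures that the weights are correctly shared between the multiple RNN cells.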
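To make the two mechanisms concrete, here is a minimal pure-Python sketch of the behaviour described above. It is a toy model, not TensorFlow's implementation: the ToyGraph class, its methods, and the variable name "weights" are all hypothetical, and setting g.reuse = True merely plays the role of reuse_variables().

```python
from collections import defaultdict
from contextlib import contextmanager


class ToyGraph:
    """Toy stand-in for a graph; hypothetical, not TensorFlow's real classes."""

    def __init__(self):
        self.uses = defaultdict(int)  # times each full scope name was requested
        self.name_scope = ""          # current name scope (made unique on entry)
        self.var_scope = ""           # current variable scope (never renamed)
        self.reuse = False            # True: get_variable only looks up
        self.variables = {}           # full variable name -> placeholder value

    @contextmanager
    def variable_scope(self, name):
        saved = (self.name_scope, self.var_scope, self.reuse)
        full = f"{self.name_scope}/{name}" if self.name_scope else name
        count = self.uses[full]
        self.uses[full] += 1
        # First use keeps the name verbatim; later uses get "_1", "_2", ...
        self.name_scope = full if count == 0 else f"{full}_{count}"
        self.var_scope = f"{self.var_scope}/{name}" if self.var_scope else name
        try:
            yield
        finally:
            self.name_scope, self.var_scope, self.reuse = saved

    def get_variable(self, name):
        full = f"{self.var_scope}/{name}"
        if self.reuse:
            if full not in self.variables:
                raise ValueError(f"{full} does not exist")
        elif full in self.variables:
            raise ValueError(f"{full} already exists")
        else:
            self.variables[full] = 0.0
        return full

    def op(self, name):
        # Ops are named after the (unique) name scope, not the variable scope.
        return f"{self.name_scope}/{name}:0"


g = ToyGraph()
outputs = []
with g.variable_scope("RNN"):
    for timestep in range(5):
        if timestep > 0:
            g.reuse = True  # plays the role of reuse_variables()
        with g.variable_scope("BasicLSTMCell"):
            g.get_variable("weights")      # same variable at every timestep
            outputs.append(g.op("mul_2"))  # op name carries the unique suffix

print(outputs)
print(list(g.variables))
```

Under these toy rules the op names run from RNN/BasicLSTMCell/mul_2:0 to RNN/BasicLSTMCell_4/mul_2:0, while the variable store holds the single entry RNN/BasicLSTMCell/weights, which is the sharing that the reuse flag is meant to guarantee.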

Answer 2

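The answer above is somewhat misleading.

Let me explain why you got two different scope names, even though it looks like you defined two identical functions: creating and creating_mod.

This is simply because you used tf.Variable(0.0, name=s) to create the variable in the function creating_mod.

ALWAYS use tf.get_variable if you want your variable to be recognized by the scope!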
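The difference can be imitated with a small pure-Python sketch. It is only a toy model under one assumption drawn from this thread: a variable made with tf.get_variable is named after the variable scope, which is kept exactly as written, whereas one made with tf.Variable is named after the name scope, which is made unique on every entry. The functions below are hypothetical stand-ins that return name strings, not TensorFlow calls.

```python
from collections import defaultdict
from contextlib import contextmanager

uses = defaultdict(int)          # times each full name-scope string was requested
state = {"name": "", "var": ""}  # current name scope / current variable scope


@contextmanager
def variable_scope(name):
    """Hypothetical stand-in for tf.variable_scope (toy model only)."""
    saved = dict(state)
    full = f"{state['name']}/{name}" if state["name"] else name
    count = uses[full]
    uses[full] += 1
    # The name scope is made unique; the variable scope is kept as given.
    state["name"] = full if count == 0 else f"{full}_{count}"
    state["var"] = f"{state['var']}/{name}" if state["var"] else name
    try:
        yield
    finally:
        state.update(saved)


def get_variable(name):
    return f"{state['var']}/{name}:0"   # named after the variable scope


def Variable(name):
    return f"{state['name']}/{name}:0"  # named after the name scope


def creating(s):
    with variable_scope("test"):
        with variable_scope("inner"):
            return get_variable(s)


def creating_mod(s):
    with variable_scope("test"):
        with variable_scope("inner"):
            return Variable(s)


names = [creating("a"), creating_mod("b"), creating("c"), creating_mod("d")]
print("\n".join(names))
```

With these rules the sketch prints the same four names as in the question: every call enters "test" again and so consumes a suffix (test, test_1, test_2, test_3), but only the tf.Variable-style names show it.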

See this issue for more details.

Thanks!
