TensorFlow: How to add a bias to the outputs of an RNN when the sequences have varying lengths

Question

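First, let me explain the input and target values of the RNN. My dataset consists of sequences (e.g. 4, 7, 1, 23, 42, 69). The RNN is trained to predict the next value in each sequence, so all values except the last are inputs and all values except the first are targets. Each value is represented as a one-hot vector.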
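The input/target split described above can be sketched as follows. This is a minimal NumPy illustration rather than the asker's code; the helper name `make_example` and the vocabulary size of 70 are assumptions chosen only so that the example sequence fits.

```python
import numpy as np

def make_example(sequence, n_classes):
    """Split a sequence into next-value inputs and targets, one-hot encoded.

    Hypothetical helper, for illustration only.
    """
    seq = np.asarray(sequence)
    inputs, targets = seq[:-1], seq[1:]           # drop the last / drop the first value
    eye = np.eye(n_classes, dtype=np.float32)     # row i is the one-hot vector for class i
    return eye[inputs], eye[targets]

x_onehot, y_onehot = make_example([4, 7, 1, 23, 42, 69], n_classes=70)
print(x_onehot.shape)                      # (5, 70)
print(x_onehot.argmax(axis=1).tolist())    # [4, 7, 1, 23, 42]
print(y_onehot.argmax(axis=1).tolist())    # [7, 1, 23, 42, 69]
```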

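I have an RNN in TensorFlow where the outputs of the RNN (tf.nn.dynamic_rnn) are sent through a feedforward layer. The input sequences vary in length, so I use the sequence_length parameter to specify the length of each sequence in a batch. The output of the RNN layer is a tensor holding one output per time step. Most of the sequences have the same length, but some are shorter. For the shorter sequences I get additional all-zero vectors (as padding).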
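To make the padding concrete, here is a small NumPy sketch of how a batch of variable-length sequences might be padded to a common length, with the true lengths recorded for `sequence_length`. The helper `pad_batch` and the sizes are hypothetical; the sketch only mimics the shape conventions and does not call TensorFlow.

```python
import numpy as np

def pad_batch(sequences, n_classes):
    """Pad one-hot sequences to a common length; return the batch and the true lengths."""
    lengths = np.array([len(s) for s in sequences], dtype=np.int32)
    max_length = int(lengths.max())
    batch = np.zeros((len(sequences), max_length, n_classes), dtype=np.float32)
    for i, seq in enumerate(sequences):
        batch[i, np.arange(len(seq)), seq] = 1.0   # one-hot rows; padding rows stay all-zero
    return batch, lengths

batch, lengths = pad_batch([[4, 7, 1], [2, 5]], n_classes=8)
print(batch.shape)        # (2, 3, 8)
print(lengths.tolist())   # [3, 2]
print(batch[1, 2].sum())  # 0.0, the padded step of the shorter sequence
```

The `lengths` array is what would be fed to the `sequence_length` placeholder.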

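The problem is that I want to send the outputs of the RNN layer through a feedforward layer. If I add a bias in this feedforward layer, the additional all-zero vectors become non-zero. Without a bias, with only weights, this works fine, since the all-zero vectors are unaffected by the multiplication. So without a bias I can set the target vectors to all zeros as well, and they will not affect the backward pass. But if I add a bias, I don't know what to put in the padded/dummy target vectors.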
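The effect is easy to reproduce outside TensorFlow. The following NumPy sketch uses made-up layer sizes and random weights (none of them from the question) to show that an all-zero row stays zero under a pure matrix multiplication but becomes the bias vector once a bias is added.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_classes = 4, 6                 # illustrative sizes, not from the question
W = rng.normal(size=(n_hidden, n_classes))
b = rng.normal(size=n_classes)

padded_output = np.zeros(n_hidden)         # what the RNN emits past sequence_length

weights_only = padded_output @ W
with_bias = padded_output @ W + b

print(np.allclose(weights_only, 0.0))      # True: weights alone keep the zeros at zero
print(np.allclose(with_bias, b))           # True: with a bias, the padded row equals b
```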

So the network looks like this:

[INPUT (1-HOT vectors, one vector for each value in the sequence)]
                      V
[GRU layer (smaller size than the input layer)]
                      V
[Feedforward layer (outputs vectors of the same size as the input)]

Here is the code:

import tensorflow as tf  # TensorFlow 1.x graph-mode API

# max_length, n_classes and n_hidden are hyperparameters defined elsewhere.
# [batch_size, max_sequence_length, size of 1-HOT vectors]
x = tf.placeholder(tf.float32, [None, max_length, n_classes])
y = tf.placeholder(tf.float32, [None, max_length, n_classes])
session_length = tf.placeholder(tf.int32, [None])

outputs, state = tf.nn.dynamic_rnn(
    tf.nn.rnn_cell.GRUCell(n_hidden),
    x,
    dtype=tf.float32,
    sequence_length=session_length
    )

layer = {'weights': tf.Variable(tf.random_normal([n_hidden, n_classes])),
         'biases': tf.Variable(tf.random_normal([n_classes]))}

# Flatten to apply same weights to all timesteps
outputs = tf.reshape(outputs, [-1, n_hidden])

prediction = tf.matmul(outputs, layer['weights'])  # + layer['biases']

# Flatten the targets the same way so they line up with the predictions
error = tf.nn.softmax_cross_entropy_with_logits(
    logits=prediction, labels=tf.reshape(y, [-1, n_classes]))

Answer 1

You can add the bias, but mask out the irrelevant sequence elements in the loss function.

See this example from the im2txt project:

weights = tf.to_float(tf.reshape(self.input_mask, [-1])) # these are the masks

# Compute losses.
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=targets, logits=logits)
batch_loss = tf.div(tf.reduce_sum(tf.multiply(losses, weights)),
                    tf.reduce_sum(weights),
                    name="batch_loss") # Here the irrelevant sequence elements are masked out
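To see why masking makes the dummy targets irrelevant, here is the same computation re-expressed in plain NumPy. It is a sketch under assumed shapes (flattened logits of shape [N, n_classes], integer targets of shape [N], a 0/1 mask of shape [N]), and the function name `masked_mean_cross_entropy` is made up for this example; it is not part of im2txt or TensorFlow.

```python
import numpy as np

def masked_mean_cross_entropy(logits, targets, mask):
    """Mean sparse softmax cross-entropy over the positions where mask == 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)       # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    losses = -log_probs[np.arange(len(targets)), targets]      # one loss per position
    weights = mask.astype(np.float64)
    return (losses * weights).sum() / weights.sum()

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 5))
targets = np.array([1, 3, 0, 0])   # the last entry is a dummy target for a padded step
mask = np.array([1, 1, 1, 0])      # 0 marks the padded step

loss = masked_mean_cross_entropy(logits, targets, mask)

# Change both the logits and the dummy target at the padded position.
changed_logits = logits.copy()
changed_logits[3] = rng.normal(size=5)
changed_targets = targets.copy()
changed_targets[3] = 4

same = np.isclose(
    loss, masked_mean_cross_entropy(changed_logits, changed_targets, mask))
print(same)   # True: the padded position does not contribute to the loss
```

Because the padded position has weight 0, whatever the bias turns its logits into, and whatever dummy target is stored there, has no influence on the loss or its gradient.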

Also, to generate the masks, see the function batch_with_dynamic_pad under ops/inputs in the same project.
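If the true lengths are already available, as with the `sequence_length` tensor in the question, the mask can also be derived from them directly. This is a different route from `batch_with_dynamic_pad`, which builds the mask while batching. Below is a NumPy sketch of the idea; `length_mask` is a hypothetical helper, and in TensorFlow the built-in `tf.sequence_mask` should serve the same purpose.

```python
import numpy as np

def length_mask(lengths, max_length):
    """Return a [batch, max_length] 0/1 mask with ones at the real time steps."""
    positions = np.arange(max_length)[None, :]          # shape [1, max_length]
    lengths = np.asarray(lengths)[:, None]              # shape [batch, 1]
    return (positions < lengths).astype(np.float32)

mask = length_mask([3, 1, 2], max_length=3)
print(mask.tolist())   # [[1.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

# Flatten in the same order as the reshaped RNN outputs to get per-position weights.
flat_weights = mask.reshape(-1)
print(flat_weights.shape)   # (9,)
```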
