Building a recurrent neural network with a feed-forward network in PyTorch
I am following this tutorial and have a question about the following class:
import torch
import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        # both layers see the concatenation of the current input and the previous hidden state
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def init_hidden(self):
        return Variable(torch.zeros(1, self.hidden_size))
This code was taken from Here. It mentions that
Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.
What I don't understand is how you can simply increase the input feature size of an nn.Linear layer and call the result an RNN. What am I missing here?
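For context, the tutorial drives this module one time step at a time, feeding the returned hidden state back into the next call. A minimal sketch of that pattern (the sizes and the fake sequence below are made up, not taken from the tutorial):

rnn = RNN(input_size=57, hidden_size=128, output_size=18)

sequence = [Variable(torch.randn(1, 57)) for _ in range(6)]  # fake 6-step sequence
hidden = rnn.init_hidden()
for x in sequence:
    # the very same i2h / i2o layers are applied at every step
    output, hidden = rnn(x, hidden)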
The network is recurrent because you evaluate it over multiple time steps in the example. The following code is also taken from the PyTorch tutorial you linked to.
loss_fn = nn.MSELoss()

batch_size = 10
TIMESTEPS = 5

# Create some fake data
batch = torch.randn(batch_size, 50)
hidden = torch.zeros(batch_size, 20)
target = torch.zeros(batch_size, 10)

loss = 0
for t in range(TIMESTEPS):
    # yes! you can reuse the same network several times,
    # sum up the losses, and call backward!
    # (rnn is a module like the one above; in this snippet it returns (hidden, output))
    hidden, output = rnn(batch, hidden)
    loss += loss_fn(output, target)
loss.backward()
So the network itself is not recurrent, but in this loop you use it as a recurrent network by repeatedly feeding in the hidden state from the previous step together with your batch input.
You could also use it non-recurrently by backpropagating the loss at every step and ignoring the hidden state, as sketched below.
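A minimal sketch of that non-recurrent variant, reusing the fake data from the snippet above (my own illustration, not tutorial code): the hidden state is reset at every step and the loss is backpropagated immediately, so nothing is carried across time.

for t in range(TIMESTEPS):
    hidden = torch.zeros(batch_size, 20)    # throw away the previous hidden state
    hidden, output = rnn(batch, hidden)
    step_loss = loss_fn(output, target)
    step_loss.backward()                    # backprop through this single step only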
Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.
This means that the information needed to compute gradients is not held in the model itself, so you can append several evaluations of the module to the graph and then backpropagate through the whole graph. This is described in the earlier paragraphs of the tutorial.
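To make the quoted sentence concrete, here is a small standalone sketch (my own, not from the tutorial) in which a single nn.Linear is reused at every time step; the recurrence exists only in the autograd graph that the loop builds:

import torch
import torch.nn as nn

cell = nn.Linear(50 + 20, 20)      # maps [input, previous hidden] -> next hidden

x = torch.randn(10, 50)            # fake batch, reused at every step
hidden = torch.zeros(10, 20)

for t in range(5):
    # the same weights are applied five times; each call adds nodes to the graph
    hidden = torch.tanh(cell(torch.cat((x, hidden), 1)))

# backpropagating through the final hidden state unrolls all five steps,
# so cell.weight.grad accumulates contributions from every step
hidden.sum().backward()
print(cell.weight.grad.shape)      # torch.Size([20, 70])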