Where does the flattened input value for the first fully connected layer (fc1) come from? (MNIST example)
Here is some convolutional neural network sample code from the PyTorch examples repository on GitHub:
https://github.com/pytorch/examples/blob/master/mnist/main.py
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
If I understand this correctly, we need to flatten the output of the last convolutional layer before we can pass it to the linear layer (fc1). So, looking at this code, we see that the input to the first fully connected layer is 9216.
Where does this number (9216) come from?
You also need to look at the forward method and the network's input shape in order to calculate the input shape of the linear/fully-connected layer. For MNIST we have a single-channel 28x28 input image. Using the formula from the Conv2d docs you can compute the output shape of each convolution operation; the max_pool2d operation follows the same input-output relation as the convolutional layers (see the sketch below).
Since the shape of the input right before flattening is a 64-channel 12x12 feature map, the total feature size is 64 * 12 * 12 = 9216.
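As a sanity check, here is a minimal sketch (plain Python; the helper name out_size is my own, not from the example) that applies the output-size formula from the Conv2d/MaxPool2d docs to this network:

def out_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    # Formula from the Conv2d docs; MaxPool2d follows the same relation.
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

h = out_size(28, kernel_size=3)            # conv1: 28 -> 26
h = out_size(h, kernel_size=3)             # conv2: 26 -> 24
h = out_size(h, kernel_size=2, stride=2)   # max_pool2d(2): 24 -> 12
print(64 * h * h)                          # 64 channels * 12 * 12 = 9216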
Input/output relations of the conv2d and max_pool2d operations
def forward(self, x):
    """For each line that changes the feature shape, a comment
    indicates <input_shape> -> <output_shape> (batch dimension omitted)."""
    x = self.conv1(x)        # [1, 28, 28] -> [32, 26, 26]
    x = F.relu(x)
    x = self.conv2(x)        # [32, 26, 26] -> [64, 24, 24]
    x = F.relu(x)
    x = F.max_pool2d(x, 2)   # [64, 24, 24] -> [64, 12, 12]
    x = self.dropout1(x)
    x = torch.flatten(x, 1)  # [64, 12, 12] -> [9216]
    x = self.fc1(x)          # [9216] -> [128]
    x = F.relu(x)
    x = self.dropout2(x)
    x = self.fc2(x)          # [128] -> [10]
    output = F.log_softmax(x, dim=1)
    return output
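You can also verify the flattened size empirically by pushing a dummy MNIST-shaped batch through the convolutional part and inspecting the shape right before fc1. This is my own sketch, not part of the linked example:

import torch
import torch.nn.functional as F

net = Net()
x = torch.zeros(1, 1, 28, 28)        # a batch of one 28x28 grayscale image
x = F.relu(net.conv1(x))
x = F.relu(net.conv2(x))
x = F.max_pool2d(x, 2)
print(x.shape)                       # torch.Size([1, 64, 12, 12])
print(torch.flatten(x, 1).shape)     # torch.Size([1, 9216])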