Pass correct image shape to model that uses a conv1d and conv2d in the same network

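I am trying to implement the VGG network exactly as described in the paper, but I get an error when I use the architecture that has conv1-256 as part of its network. Below is my code.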

def _make_convo_layers(architecture) -> torch.nn.Sequential:
        """
        Create convolutional layers from the vgg architecture type passed in.
        : param architecture:
        """
        layers = []
        in_channels = 3
        for layer in architecture:
            if type(layer) == int:
                out_channels = layer
                layers += [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, stride=1), nn.ReLU()]
                # layers.append([nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, stride=1) + nn.ReLU()])
                in_channels = layer
            elif (layer == 'Conv1-256'):
                out_channels = 256
                layers += [nn.Conv1d(256, out_channels, kernel_size=3, padding=1, stride=1), nn.ReLU()]
            elif (layer == 'LRN'):
                layers += [nn.LocalResponseNorm(5, alpha=0.0001, beta=0.75, k=1)]
            elif (layer == 'M'):
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        return nn.Sequential(*layers)

Below is how I pass some random data to the model.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
vgg = VGGNet(config['vgg16-C1']).to(device)
x = torch.randn(1, 3, 224, 224).to(device)
model = vgg(x).to(device)
print(model.shape)

Below is the error I get when I pass the x variable to the model.

RuntimeError: Expected 3-dimensional input for 3-dimensional weight [256, 256, 3], but got 4-dimensional input of size [1, 256, 56, 56] instead

Any help would be appreciated.

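As jhso commented, Conv1d is the wrong layer here. According to the VGG explanation on this VGG page, conv1-256 does not mean a one-dimensional convolution. It means a convolution with kernel size 1 instead of the usual kernel size 3, so the layer should still be a 2D convolution that accepts the 4-dimensional feature map.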

import torch
import torch.nn as nn


def _make_convo_layers(architecture) -> torch.nn.Sequential:
    """
    Create convolutional layers from the vgg architecture type passed in.
    :param architecture: list of layer entries (int, 'Conv1-256', 'LRN' or 'M')
    """
    layers = []
    in_channels = 3
    for layer in architecture:
        if isinstance(layer, int):
            # 3x3 convolution; padding=1 keeps the spatial size unchanged
            out_channels = layer
            layers += [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, stride=1), nn.ReLU()]
            in_channels = out_channels
        elif layer == 'Conv1-256':
            # 1x1 convolution: still a 2D layer, so it accepts the 4D feature map;
            # padding=0 keeps the spatial size unchanged when kernel_size=1
            out_channels = 256
            layers += [nn.Conv2d(in_channels, out_channels, kernel_size=1, padding=0, stride=1), nn.ReLU()]
            in_channels = out_channels
        elif layer == 'LRN':
            layers += [nn.LocalResponseNorm(5, alpha=0.0001, beta=0.75, k=1)]
        elif layer == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
    return nn.Sequential(*layers)
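
To see why the layer type and the padding both matter, here is a minimal shape check. It is only a sketch: the tensor below is random data with the shape from the error message, not the output of the actual VGGNet class, and the variable names are made up for illustration.

```python
import torch
import torch.nn as nn

# Same shape as the feature map in the error message: [batch, channels, height, width].
x = torch.randn(1, 256, 56, 56)

# 1x1 convolution as a 2D layer: mixes channels at each pixel, keeps height and width.
conv_1x1 = nn.Conv2d(256, 256, kernel_size=1, padding=0)
# Same layer with padding=1: height and width each grow by 2.
conv_1x1_padded = nn.Conv2d(256, 256, kernel_size=1, padding=1)
# 1D convolution expects [batch, channels, length], so it rejects the 4D feature map.
conv_1d = nn.Conv1d(256, 256, kernel_size=1)

with torch.no_grad():
    out_1x1 = conv_1x1(x)
    out_padded = conv_1x1_padded(x)
    try:
        conv_1d(x)
        conv1d_failed = False
    except RuntimeError:
        conv1d_failed = True

print(out_1x1.shape)     # torch.Size([1, 256, 56, 56])
print(out_padded.shape)  # torch.Size([1, 256, 58, 58])
print(conv1d_failed)
```

With kernel_size=1, padding=0 is what preserves the 56x56 resolution; leaving padding=1 from the 3x3 layers would enlarge every feature map that passes through the layer and change the size seen by later layers.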