Caffe: variable input-image size

Question

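I am trying out Google's deepdream code, which uses Caffe. It relies on the GoogLeNet model pre-trained on ImageNet, as provided by the Model Zoo. That means the network was trained on images cropped to 224x224 pixels. From train_val.prototxt: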

layer {            
  name: "data"     
  type: "Data"     
  ...

  transform_param {
     mirror: true   
     crop_size: 224
  ...

The deploy.prototxt used for processing also defines an input layer of size 224x224x3x10 (RGB images of size 224x224, batch size 10).

name: "GoogleNet"
input: "data"
input_shape {
  dim: 10
  dim: 3
  dim: 224
  dim: 224
}

However, I can use this network to process images of any size (the example above used one of 1024x574 pixels).

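deploy.prototxt does not configure Caffe to use cropping.
The preprocessing in the deepdream code only subtracts the mean (demeaning); there is no cropping there either.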

How is it possible that I can process images that are too big for the input layer?

The complete code can be found here

Answer 1

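DeepDream does not crop the input image at all.
If you look closely, you will notice that it operates on a mid-level layer: its end= argument is set to 'inception_4c/output' or end='inception_3b/5x5_reduce', but never to end='loss3/classifier'. The reason is that GoogLeNet, up to these layers, is a fully-convolutional network. That is, it can take an input image of any size and produce outputs whose size is roughly proportional to the input size (the exact output size also depends on the conv padding and on pooling).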
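To make the "proportional output size" point concrete, here is a small sketch that applies the usual output-size arithmetic to the first few GoogLeNet layers. The kernel, stride, and padding values are my recollection of the BVLC GoogLeNet definition and should be treated as assumptions, as should the rounding rules (round down for convolution, round up for pooling). The helper names are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output length of a convolution along one axis (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output length of a pooling layer along one axis (rounds up)."""
    return -(-(size + 2 * pad - kernel) // stride) + 1


def stem_out(size):
    """Feature-map length after the layers that precede inception_3a.

    Layer parameters are assumed, not read from the prototxt.
    """
    size = conv_out(size, kernel=7, stride=2, pad=3)  # conv1/7x7_s2
    size = pool_out(size, kernel=3, stride=2)         # pool1/3x3_s2
    size = conv_out(size, kernel=1)                   # conv2/3x3_reduce
    size = conv_out(size, kernel=3, pad=1)            # conv2/3x3
    size = pool_out(size, kernel=3, stride=2)         # pool2/3x3_s2
    return size


for h, w in [(224, 224), (574, 1024)]:
    print((h, w), "->", (stem_out(h), stem_out(w)))
# (224, 224) -> (28, 28)
# (574, 1024) -> (71, 128)
```

Nothing in this arithmetic requires a 224x224 input: every layer simply produces a larger feature map for a larger image. Only a fully-connected layer such as loss3/classifier has weights tied to one fixed input size, which is why the deepdream code stops before it.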

To adjust the network to inputs of different sizes, the deepdream function contains the line

src.reshape(1,3,h,w) # resize the network's input image size

This line adjusts the network's layers to accommodate an input of shape (1,3,h,w).
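For context, a minimal sketch of the same reshape outside the deepdream function might look like the following. It assumes `net` is a pycaffe `caffe.Net` whose input blob is named "data" (as in the deploy file quoted in the question). The helper name `resize_net_input` is hypothetical and is not part of the deepdream code.

```python
def resize_net_input(net, h, w, blob_name="data"):
    """Resize the input blob of a pycaffe-style net to a single h x w RGB image."""
    src = net.blobs[blob_name]  # the deepdream code calls this blob `src`
    src.reshape(1, 3, h, w)     # batch of 1, 3 channels, new spatial size
    net.reshape()               # propagate the new input shape to the other layers
    return src
```

After this call, a forward pass that stops at a convolutional layer, for example `net.forward(end='inception_4c/output')`, should work for whatever `h` and `w` were given. The explicit `net.reshape()` is included for clarity; my understanding is that a forward pass also reshapes the layers on its own.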

