concat 层之后的 googlenet 中的输出维度是多少?
What is the output dimension in the googlenet after the concat layer?
当查看 googlenet 的 prototxt 时,发现初始层在末尾有一个 concat 层,它接受多个底部输入。
例如:
layer {
name: "inception_3a/output"
type: "Concat"
bottom: "inception_3a/1x1"
bottom: "inception_3a/3x3"
bottom: "inception_3a/5x5"
bottom: "inception_3a/pool_proj"
top: "inception_3a/output"
}
可以看出,有一个 1x1 conv-layer,一个 3x3 conv-layer,一个 5x5 conv-layer,最后是一个池化层。这些层描述如下:
layer {
name: "inception_3a/1x1"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_1x1"
type: "ReLU"
bottom: "inception_3a/1x1"
top: "inception_3a/1x1"
}
layer {
name: "inception_3a/3x3_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3_reduce"
}
layer {
name: "inception_3a/3x3"
type: "Convolution"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3"
type: "ReLU"
bottom: "inception_3a/3x3"
top: "inception_3a/3x3"
}
layer {
name: "inception_3a/5x5_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5_reduce"
}
layer {
name: "inception_3a/5x5"
type: "Convolution"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5"
type: "ReLU"
bottom: "inception_3a/5x5"
top: "inception_3a/5x5"
}
layer {
name: "inception_3a/pool"
type: "Pooling"
bottom: "pool2/3x3_s2"
top: "inception_3a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_3a/pool_proj"
type: "Convolution"
bottom: "inception_3a/pool"
top: "inception_3a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
可以看出它们有不同的输出数量和不同的过滤器大小,总之concat层的文档如下:
input:
n_i * c_i * h * w for each input blob i from 1 to K.
Output:
if axis = 0: (n_1 + n_2 + ... + n_K) * c_1 * h * w
, and all input c_i
should be the same.
if axis = 1: n_1 * (c_1 + c_2 + ... + c_K) * h * w
, and all input n_i should be the same.
首先,我不确定默认值是多少,其次我不确定哪个维度将具有输出体积,因为宽度和高度应该保持不变,但所有三个转换层都会产生不同的输出。任何指针将不胜感激
'Concat' 轴的默认值为 1,因此通过通道维度连接。为了做到这一点,所有连接的层都应该具有相同的高度和宽度。查看日志,尺寸为(假设批量大小为 32):
inception_3a/1x1 -> [32, 64, 28, 28]
inception_3a/3x3 -> [32, 128, 28, 28]
inception_3a/5x5 -> [32, 32, 28, 28]
inception_3a/pool_proj -> [32, 32, 28, 28]
因此最终输出的维度为:
inception_3a/output -> [32 (64+128+32+32) 28, 28] -> [32, 256, 28, 28]
正如 Caffe 日志所预期的那样:
Creating Layer inception_3a/output
inception_3a/output <- inception_3a/1x1
inception_3a/output <- inception_3a/3x3
inception_3a/output <- inception_3a/5x5
inception_3a/output <- inception_3a/pool_proj
inception_3a/output -> inception_3a/output
Setting up inception_3a/output
Top shape: 32 256 28 28 (6422528)
当查看 googlenet 的 prototxt 时,发现初始层在末尾有一个 concat 层,它接受多个底部输入。
例如:
layer {
name: "inception_3a/output"
type: "Concat"
bottom: "inception_3a/1x1"
bottom: "inception_3a/3x3"
bottom: "inception_3a/5x5"
bottom: "inception_3a/pool_proj"
top: "inception_3a/output"
}
可以看出,有一个 1x1 conv-layer,一个 3x3 conv-layer,一个 5x5 conv-layer,最后是一个池化层。这些层描述如下:
layer {
name: "inception_3a/1x1"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_1x1"
type: "ReLU"
bottom: "inception_3a/1x1"
top: "inception_3a/1x1"
}
layer {
name: "inception_3a/3x3_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3_reduce"
}
layer {
name: "inception_3a/3x3"
type: "Convolution"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_3x3"
type: "ReLU"
bottom: "inception_3a/3x3"
top: "inception_3a/3x3"
}
layer {
name: "inception_3a/5x5_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5_reduce"
}
layer {
name: "inception_3a/5x5"
type: "Convolution"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5"
type: "ReLU"
bottom: "inception_3a/5x5"
top: "inception_3a/5x5"
}
layer {
name: "inception_3a/pool"
type: "Pooling"
bottom: "pool2/3x3_s2"
top: "inception_3a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_3a/pool_proj"
type: "Convolution"
bottom: "inception_3a/pool"
top: "inception_3a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
可以看出它们有不同的输出数量和不同的过滤器大小,总之concat层的文档如下:
input:
n_i * c_i * h * w for each input blob i from 1 to K.
Output:
if axis = 0:
(n_1 + n_2 + ... + n_K) * c_1 * h * w
, and all input c_i should be the same.if axis = 1:
n_1 * (c_1 + c_2 + ... + c_K) * h * w
, and all input n_i should be the same.
首先,我不确定默认值是多少,其次我不确定哪个维度将具有输出体积,因为宽度和高度应该保持不变,但所有三个转换层都会产生不同的输出。任何指针将不胜感激
'Concat' 轴的默认值为 1,因此通过通道维度连接。为了做到这一点,所有连接的层都应该具有相同的高度和宽度。查看日志,尺寸为(假设批量大小为 32):
inception_3a/1x1 -> [32, 64, 28, 28]
inception_3a/3x3 -> [32, 128, 28, 28]
inception_3a/5x5 -> [32, 32, 28, 28]
inception_3a/pool_proj -> [32, 32, 28, 28]
因此最终输出的维度为:
inception_3a/output -> [32 (64+128+32+32) 28, 28] -> [32, 256, 28, 28]
正如 Caffe 日志所预期的那样:
Creating Layer inception_3a/output
inception_3a/output <- inception_3a/1x1
inception_3a/output <- inception_3a/3x3
inception_3a/output <- inception_3a/5x5
inception_3a/output <- inception_3a/pool_proj
inception_3a/output -> inception_3a/output
Setting up inception_3a/output
Top shape: 32 256 28 28 (6422528)