Converting a .caffemodel to Keras h5

Question

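I want to fine-tune a gender detector that was trained in Caffe, using my own dataset. The model was trained on about 500,000 face images; its authors fine-tuned a VGG16 that had been pretrained on ImageNet. I want to use this model as my base model, so I downloaded its gender.caffemodel file.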

I converted this model to an h5 file for use in Keras with the tool at the link below:

https://github.com/pierluigiferrari/caffe_weight_converter

This tool converts only the weights, not the architecture. I want to train my model in Keras, so I defined the VGG-16 architecture like this:

from keras.models import Sequential
from keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D, ZeroPadding2D

# VGG-16. Conv2D(n, (3, 3)) is the Keras 2 spelling of Convolution2D(n, 3, 3).
tmp_model = Sequential()

# Block 1: two 64-filter convolutions
tmp_model.add(ZeroPadding2D((1, 1), input_shape=(224, 224, 3)))
tmp_model.add(Conv2D(64, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(64, (3, 3), activation='relu'))
tmp_model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 2: two 128-filter convolutions
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(128, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(128, (3, 3), activation='relu'))
tmp_model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 3: three 256-filter convolutions
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(256, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(256, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(256, (3, 3), activation='relu'))
tmp_model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 4: three 512-filter convolutions
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(512, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(512, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(512, (3, 3), activation='relu'))
tmp_model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Block 5: three 512-filter convolutions
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(512, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(512, (3, 3), activation='relu'))
tmp_model.add(ZeroPadding2D((1, 1)))
tmp_model.add(Conv2D(512, (3, 3), activation='relu'))
tmp_model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# Classifier head: two 4096-unit layers, then a 2-class softmax
tmp_model.add(Flatten())
tmp_model.add(Dense(4096, activation='relu'))
tmp_model.add(Dropout(0.5))
tmp_model.add(Dense(4096, activation='relu'))
tmp_model.add(Dropout(0.5))
tmp_model.add(Dense(2, activation='softmax'))

# Load the weights converted from gender.caffemodel
tmp_model.load_weights('/home/gender.h5')

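This code loads the weights successfully. Now I want to use this model's weights to fine-tune another network for a different classification task with a different number of classes. Because the number of classes differs from tmp_model, I copy the weights from tmp_model into the layers of the new model, except for the last layer, the softmax. The code for the new model is exactly the same as for tmp_model, apart from that last layer. What I do now is copy the weights from tmp_model to the new model layer by layer: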

for i, weights in enumerate(weights_list[0:31]):
    model.layers[i].set_weights(weights)

This is where the problem appears. When I run my code, it gives me this error:

ValueError: You called `set_weights(weights)` on layer "zero_padding2d_14" with a  weight list of length 3, but the layer was expecting 0 weights. Provided weights: [[[[ 0.27716994  0.05686508  0.0098957  ... -0.055...

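As I said, tmp_model and model have exactly the same architecture except for the last layer. That is why I simply copy the weights of every layer except the last one. What am I doing wrong?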

Answer 1

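It seems to me that weights_list contains entries only for the layers that have weights. Normally it would have just 16 entries, because VGG16 has 16 layers with weights. model.layers, however, runs from 0 to N with N greater than 16, because it also includes the layers that have no weights, such as the ReLU, padding, and max-pooling layers. Indexing both lists with the same i therefore hands convolution weights to a ZeroPadding2D layer, which expects none.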
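One way to avoid the index mismatch is to pair the layers of the two models directly and skip any layer that has no weights. The following is only a sketch: the helper names `weighted_layers` and `copy_weights` and the `skip_last` parameter are made up for illustration, and it assumes both models list their layers in the same order, as in the question. It relies only on the `layers` attribute and the `get_weights()` / `set_weights()` methods of Keras layers.

```python
def weighted_layers(model):
    """Return only the layers of `model` that actually hold weights."""
    return [layer for layer in model.layers if layer.get_weights()]


def copy_weights(src_model, dst_model, skip_last=1):
    """Copy weights layer by layer from src_model to dst_model.

    The last `skip_last` layers are left untouched (here: the softmax layer,
    whose shape differs between the two models). Layers without weights,
    such as ZeroPadding2D, MaxPooling2D, Flatten and Dropout, are skipped.
    Returns the number of layers whose weights were copied.
    """
    pairs = list(zip(src_model.layers, dst_model.layers))
    pairs = pairs[:len(pairs) - skip_last]

    copied = 0
    for src_layer, dst_layer in pairs:
        weights = src_layer.get_weights()
        if not weights:
            # Nothing to copy for this layer type.
            continue
        dst_layer.set_weights(weights)
        copied += 1
    return copied


# Hypothetical usage with the models from the question:
# copy_weights(tmp_model, model)
# print(len(weighted_layers(tmp_model)))  # how many layers carry weights
```

Because the source and destination layers are taken from the same position in each model, a weight list can no longer land on a padding or pooling layer.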

machine-learning

deep-learning

caffe

keras