Caffe Python | Low accuracy of GoogleNet could be caused by bad form of input data?

Question

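I am new to Caffe and deep learning, so please excuse my inexperience and naive questions. I want to train GoogleNet on the FER2013 dataset (it consists of faces, and the goal is to identify which of 7 categories each face belongs to). However, the data is not in image format: each sample is an array of 48x48 = 2304 values, each between 0 and 255. So, to create the lmdb files that Caffe needs, I wrote the following Python script to convert the arrays into real images.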

import numpy as np
from PIL import Image
import csv
import itertools

with open('fer2013.csv', 'rb') as f:
    mycsv = csv.reader(f)
    i=0
    for row in itertools.islice(mycsv, 340):
        data = row[1]
        data = data.split()
        data = map(int, data)
        data = np.array(data)
        im = Image.fromarray(data.reshape((48,48)).astype('uint8')*255)
        directory='imagestotest/'
        path_to_save = directory+"image"+str(i)+".jpg"
        path = "image"+str(i)+".jpg"
        im.save(path_to_save)
        i=i+1
        with open("testset.txt", "a") as myfile:
            myfile.write(path+" "+row[0]+"\n")

Then I prepare my lmdb file with the following command:

GLOG_logtostderr=1 ./deep-learning/caffe/build/tools/convert_imageset --resize_height=256 --resize_width=256 --shuffle /home/panos/Desktop/images/ /home/panos/Desktop/trainingset.txt /home/panos/Desktop/train_lmdb

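Finally, I compute the image_mean, edit train_val.prototxt, and set num_output=7 in the loss1, loss2, and loss3 layers (since I have 7 classes, 0-6).

I ran my model (training set size: 5000, test set size: 340) and the accuracy is rather disappointing: close to 23% (top-1) and 88.8% (top-5).

Could this be a hyperparameter configuration problem, or were my input files not created properly? (I ask because I am wary of my Python cooking.)

In case it helps, my main hyperparameters are: test_iter: 7, test_interval: 40, base_lr: 0.001, momentum: 0.9, weight_decay: 0.0002.

Thanks in advance!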

Answer 1

To use a pretrained model, you first need to download the GoogleNet model from here. You can then run this command:

caffe train --solver solver.prototxt --weights bvlc_googlenet.caffemodel

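One of the main problems in training is weight initialization. Without proper initialization, the model may fail to converge or may perform poorly. You cannot initialize all weights to the same value, because the units in a layer would then compute identical outputs and receive identical updates. Other recommendations are that the weights be sparse, orthogonal, normalized, and so on. This is why it is usually recommended to start from the weights of a pretrained model, an approach known as transfer learning. For details on weight initialization, see this by karpathy. You can also see What are good initial weights in a neural network? For a deeper understanding, see the papers listed below.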
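To illustrate the point about identical initial weights, here is a minimal NumPy sketch. It is not taken from Caffe; it only compares a constant initialization with the Glorot (Xavier) uniform scheme described in reference [3]. The helper names `glorot_uniform` and `hidden_activations`, the layer sizes, and the constant 0.1 are assumptions chosen for illustration.

```python
import numpy as np


def glorot_uniform(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) weight matrix from U(-limit, limit),
    with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def hidden_activations(x, weights):
    """Forward pass through a single tanh layer without bias."""
    return np.tanh(x @ weights)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # 4 samples, 8 input features

# Constant initialization: every hidden unit computes the same function,
# so all columns of the activation matrix are identical.
constant_w = np.full((8, 5), 0.1)
constant_h = hidden_activations(x, constant_w)

# Glorot initialization: hidden units start out different from each other.
glorot_w = glorot_uniform(8, 5, rng)
glorot_h = hidden_activations(x, glorot_w)

print("constant init, units identical:", np.allclose(constant_h, constant_h[:, :1]))
print("glorot init, units identical:  ", np.allclose(glorot_h, glorot_h[:, :1]))
```

With the constant matrix, the five hidden units are indistinguishable, so gradient descent has no way to make them learn different features. Starting from pretrained weights sidesteps the problem because those weights are already differentiated.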

[1] Bengio, Yoshua. "Practical recommendations for gradient-based training of deep architectures." Neural Networks: Tricks of the Trade. Springer Berlin Heidelberg, 2012. 437-478.

[2] LeCun, Y., Bottou, L., Orr, G. B., and Müller, K. (1998a). "Efficient backprop." In Neural Networks: Tricks of the Trade.

[3] Glorot, Xavier, and Yoshua Bengio. "Understanding the difficulty of training deep feedforward neural networks." International Conference on Artificial Intelligence and Statistics. 2010.
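On the input-data half of the question, one detail of the conversion script may be worth checking. The pixel values are already in the range 0 to 255, so multiplying the `uint8` array by 255 wraps around modulo 256, which turns each nonzero value v into 256 - v (roughly a photographic negative). The sketch below shows that effect and a conversion without the multiplication. It is an illustration, not a verified fix for the reported accuracy, since the training and test images were presumably converted the same way. The helper names `pixels_to_array` and `save_face` are hypothetical, and the code assumes Python 3.

```python
import numpy as np


def pixels_to_array(pixel_string, side=48):
    """Turn a space-separated string of side*side gray values (0-255)
    into a (side, side) uint8 array, without any rescaling."""
    values = np.array([int(v) for v in pixel_string.split()], dtype=np.int64)
    if values.size != side * side:
        raise ValueError("expected %d values, got %d" % (side * side, values.size))
    if values.min() < 0 or values.max() > 255:
        raise ValueError("pixel values must lie in 0..255")
    return values.reshape((side, side)).astype(np.uint8)


def save_face(pixel_string, path):
    """Write one face to disk as a grayscale image (requires Pillow)."""
    from PIL import Image  # imported here so the parsing helper works without Pillow
    Image.fromarray(pixels_to_array(pixel_string)).save(path)


# What the extra "* 255" does to uint8 data: the product stays uint8,
# so it wraps modulo 256 and each nonzero v becomes 256 - v.
original = np.array([0, 1, 128, 200, 255], dtype=np.uint8)
wrapped = original * 255
print(original, "->", wrapped)
```

A call such as `save_face(row[1], "image0.jpg")` would replace the `Image.fromarray(...)` and `im.save(...)` lines of the original loop, where `row[1]` is the pixel column as in the question.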


python

deep-learning

caffe