How to determine the architecture of a convolutional neural network

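I am studying deep learning for computer vision.

I have read a lot about how neural networks work: backpropagation, stochastic gradient descent, overfitting, regularization, and so on. Those follow 'hard' mathematical rules, which are easy to understand.

But how do I know what architecture my convolutional neural network needs? For example, I want to classify these plants: http://www.biohof-waldegg.ch/Bilder/Blacke%201%20(Individuell).JPG

I have studied examples that use the MNIST database (handwritten digits). Why do most examples use these architectures: Conv 5x5 -> Pooling(2,max) -> Conv 5x5? I have plotted the weights of the first hidden layer, but the image filters do not look familiar to me (neither like a high-pass filter for edge detection nor like a low-pass filter).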

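Is it better to add more feature maps to a layer or to add more hidden layers?
How can I determine if the network is too deep / too shallow?
How can I determine if a layer has too many / too few feature maps?
How can I determine if the kernel size is too big / too small?
When do I choose conv -> conv -> pooling instead of conv -> pooling -> conv?
What impact does the stride parameter have? (I know what this parameter does, but not when and how to adjust it.)
Is there a way to check which features a layer is detecting (e.g. edges / colors / shapes)?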

There are no proven hard rules for how to build a neural network (or a CNN). It is an open problem.

Why do most examples use these architectures: Conv 5x5 -> Pooling(2,max) -> Conv 5x5?

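This is not the case. Most architectures use 3x3 convolutions, because subsequent convolutional layers expand the receptive field to arbitrary sizes. Empirically, several researchers (e.g. Rethinking the Inception Architecture for Computer Vision) found that these work better.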
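The receptive-field argument can be checked with a little arithmetic. The sketch below is only an illustration: the helper names `receptive_field` and `conv_weights` and the channel count of 64 are assumptions made for this example, not taken from any library or from the paper. It shows that two stacked 3x3 convolutions see the same 5x5 input window as a single 5x5 convolution while using fewer weights.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between neighbouring units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in one square conv layer, ignoring biases."""
    return kernel * kernel * in_channels * out_channels


one_5x5 = [(5, 1)]
two_3x3 = [(3, 1), (3, 1)]
three_3x3 = [(3, 1)] * 3

print(receptive_field(one_5x5))    # 5
print(receptive_field(two_3x3))    # 5
print(receptive_field(three_3x3))  # 7

channels = 64  # hypothetical channel count, chosen only for illustration
print(conv_weights(5, channels, channels))      # 102400
print(2 * conv_weights(3, channels, channels))  # 73728
```

Under these assumptions the stacked 3x3 version covers the same window with about 28% fewer weights, and it gains an extra nonlinearity between the two layers.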

How can I determine if the network is too deep / too shallow?

Inference is too slow -> the network is too deep
Accuracy is too low -> more depth may help

How can I determine if the kernel size is too big / too small?

Use 3x3 by default. For the reasons, see Rethinking the Inception Architecture for Computer Vision.

When do I choose conv -> conv -> pooling instead of conv -> pooling -> conv?

I would rather phrase this as conv -> conv -> pooling versus conv -> pooling, so the question becomes "how do I determine how many subsequent convolutional layers I should have?" Again, this is an open problem. Most people choose 2 or 3 subsequent convolutional layers, but in the end it seems to boil down to "try it out". (If there is a more engineered approach, please let me know!)
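One way to make "try it out" a bit more systematic is to treat the number of convolutions per block as a single knob and enumerate the candidates. The sketch below is a plain-Python illustration under stated assumptions: `build_blocks` and `trace_size` are hypothetical helpers, the convolutions are 3x3 with stride 1 and "same" padding, pooling is 2x2 with stride 2, and the 32-pixel input is arbitrary. It only lists the candidates and their output sizes; training each one and comparing validation accuracy is the part that still has to be done by hand.

```python
def build_blocks(num_blocks, convs_per_block):
    """Layer list for `num_blocks` blocks of (conv x N -> pool).

    Each layer is a (kind, kernel_size, stride) tuple.
    """
    layers = []
    for _ in range(num_blocks):
        layers += [("conv", 3, 1)] * convs_per_block
        layers.append(("pool", 2, 2))
    return layers


def trace_size(input_size, layers):
    """Spatial size after all layers (convs padded, pools unpadded)."""
    size = input_size
    for kind, kernel, stride in layers:
        padding = kernel // 2 if kind == "conv" else 0
        size = (size + 2 * padding - kernel) // stride + 1
    return size


# Candidates: conv -> pool, conv -> conv -> pool, conv -> conv -> conv -> pool
for convs_per_block in (1, 2, 3):
    layers = build_blocks(num_blocks=3, convs_per_block=convs_per_block)
    print(convs_per_block, len(layers), trace_size(32, layers))
# 1 6 4
# 2 9 4
# 3 12 4
```

All three candidates end at the same spatial size, so they differ only in depth and cost between pooling steps, which is what the comparison is meant to isolate.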

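What impact does the stride parameter have?

The stride reduces the size of the output feature map. It therefore reduces the memory footprint considerably (by a factor of roughly 1/stride^2, since both the height and the width shrink by the stride).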
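The 1/stride^2 factor follows from the standard output-size formula for a convolution, floor((size + 2 * padding - kernel) / stride) + 1, applied to both spatial dimensions. A minimal sketch, assuming a hypothetical 224x224 input, a 3x3 kernel with padding 1 and 64 output channels (the helper names are made up for this example):

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_elements(height, width, channels, kernel, stride, padding):
    """Number of activations in the output feature map."""
    out_h = conv_output_size(height, kernel, stride, padding)
    out_w = conv_output_size(width, kernel, stride, padding)
    return out_h * out_w * channels


for stride in (1, 2, 4):
    n = feature_map_elements(224, 224, 64, kernel=3, stride=stride, padding=1)
    print(stride, n)
# 1 3211264
# 2 802816
# 4 200704
```

Doubling the stride cuts the number of stored activations to a quarter here, which matches the 1/stride^2 estimate.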

Is there a way to check which features a layer is detecting?

Zeiler and Fergus: Visualizing and Understanding Convolutional Networks
