Does the Concat layer really produce a 4D output in Caffe?

I was reading the documentation for the Concat layer here: Layer Catalogue Concat. It states:

Input:

n_i * c_i * h * w for each input blob i from 1 to K.

Output:

if axis = 0: (n_1 + n_2 + ... + n_K) * c_1 * h * w, and all input c_i should be the same.

if axis = 1: n_1 * (c_1 + c_2 + ... + c_K) * h * w, and all input n_i should be the same.

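However, I have trouble picturing this. For example, how can the output be 4-dimensional when all the layers take 3D input? Is there some trick for reading the 4D output as a 3D one?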

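Actually, both the inputs and the outputs are 4D: batch dimension, number of channels, height, and width. You can get a different number of dimensions in special cases (e.g. 5D for RGB-D inputs), but for standard RGB images everything stays 4D throughout (except in fully connected layers).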
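To make the shape arithmetic concrete, here is a minimal pure-Python sketch of how the output shape follows from the input shapes. It does not call Caffe; `concat_shape` is a hypothetical helper written for illustration, and it assumes that every dimension other than the concatenation axis must match across inputs.

```python
def concat_shape(shapes, axis=1):
    """Return the shape obtained by concatenating blobs along `axis`.

    Hypothetical helper for illustration only, not part of Caffe.
    Each shape is (n, c, h, w): batch size, channels, height, width.
    """
    if not shapes:
        raise ValueError("need at least one input shape")
    first = tuple(shapes[0])
    if not 0 <= axis < len(first):
        raise ValueError("axis out of range")

    total = 0
    for shape in shapes:
        shape = tuple(shape)
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same number of dimensions")
        # Assumption: every dimension except the concat axis must agree.
        for dim, (a, b) in enumerate(zip(shape, first)):
            if dim != axis and a != b:
                raise ValueError(f"dimension {dim} differs: {a} vs {b}")
        total += shape[axis]

    out = list(first)
    out[axis] = total  # only the concat axis grows; the rank stays 4
    return tuple(out)


# Two blobs for the same batch of 8 images: 64 and 32 channels, 28x28 each.
a = (8, 64, 28, 28)
b = (8, 32, 28, 28)
print(concat_shape([a, b], axis=1))  # (8, 96, 28, 28)
print(concat_shape([a, a], axis=0))  # (16, 64, 28, 28)
```

In both cases the inputs are already 4D because of the leading batch dimension, so the output has the same rank as the inputs and only one dimension changes in size.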