The activation in my CNN does not look correct - or is the heatmap the problem?

Question

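I am generating heatmaps for a convolutional neural network built with Keras, as described here. When I run the algorithm on the vanilla VGG16 network, the heatmap looks fine: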

I then created my own custom model based on the VGG16 network, but with custom top layers:

input_layer = layers.Input(shape=(img_size, img_size, 3), name="model_input")
vgg16_base = VGG16(weights="imagenet", include_top=False, input_tensor=input_layer)
temp_model = vgg16_base.output
temp_model = layers.Flatten()(temp_model)
temp_model = layers.Dense(256, activation="relu")(temp_model)
temp_model = layers.Dense(1, activation="sigmoid")(temp_model)
custom = models.Model(inputs=input_layer, outputs=temp_model)

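However, when I generate the heatmap for the same layer of my custom network (i.e. the last conv layer of the VGG16 base, which is part of my new network), using exactly the same code/function, the heatmap looks wrong: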

My custom network reaches a validation/testing accuracy of 97-98%, so I assume it works fine. Why is the activation/heatmap so far off, then? Or am I missing something else here?

PS: For your reference, the heatmap is created by the function listed here. It is called like this:

# Load the image from disk and preprocess it via Keras tools
img_path = "/path/to/image.jpg"
img = image.load_img(img_path, target_size=(224, 224))
img_tensor = image.img_to_array(img)
img_tensor = np.expand_dims(img_tensor, axis=0)
img_tensor = preprocess_input(img_tensor)

# At this point I either load the VGG16 model directly (heatmap works),
# or I create my own custom VGG16-based model (heatmap does not work)
# The model itself is then stored into the variable "model"

preds = model.predict(img_tensor)
model_prediction = model.output[:, np.argmax(preds[0])]

# Then I call the custom function referred to above
input_layer = model.get_layer("model_input")
conv_layer = model.get_layer("block5_conv3")
plot_conv_heat_map(model_prediction, input_layer, conv_layer, img_tensor, img_path)

Answer 1

Short answer: since you labeled dogs as 1 and cats as 0 during training, you need to replace model_prediction with 1 - model_prediction to find the regions relevant to a cat:

plot_conv_heat_map(1 - model_prediction, ...)

Long answer: when you use the original VGG model, the last layer has 1000 neurons (assuming you use the pre-trained ImageNet model), one for each of the 1000 different classes:

# last layer in VGG model
x = layers.Dense(classes, activation='softmax', name='predictions')(x)

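Each of these neurons has an output value between zero and one (with the constraint that the outputs sum to one). Therefore, the most activated neuron (i.e. the one with the highest output) corresponds to the predicted class. So you find it like this: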

model_prediction = model.output[:, np.argmax(preds[0])]
                                        \
                                         \___ finds the index of the neuron with maximum output

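Then you pass it to the visualization function, which computes its gradient with respect to the selected convolutional layer and visualizes the heatmap: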

plot_conv_heat_map(model_prediction, ...)
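As a toy illustration of the selection step above, here is a minimal sketch in plain Python. The four raw scores are made up for the example (the real VGG16 head has 1000 outputs); the only point is that softmax outputs sum to one and that the index of the largest one is what np.argmax(preds[0]) picks.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for a 4-class toy problem (not real VGG16 outputs).
logits = [1.0, 4.0, 0.5, 2.0]
probs = softmax(logits)

# Plays the same role as np.argmax(preds[0]) in the question's code:
# the index of the most activated output neuron.
predicted_index = max(range(len(probs)), key=lambda i: probs[i])
predicted_score = probs[predicted_index]
```

With a softmax head, the selected score always belongs to the class the model actually predicted, which is the assumption the visualization relies on.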

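So far, so good. However, in your custom model you have turned the problem from a multi-class classification task into a binary classification task, i.e. dog vs. cat. You are using a sigmoid layer with a single unit as the last layer, and you treat the active state of that neuron (an output close to 1) as dog and its inactive state (an output close to 0) as cat. So your network is essentially a dog detector, and if there is no dog, we assume there is a cat in the image.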

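Well, you may ask, "What's the problem with that?" The answer is that there is nothing wrong with the training of the model; as you said, you get good training accuracy. But remember the assumption behind the visualization function: it takes as input the neuron with the highest output, which corresponds to the class detected in the image. Now give your custom model a cat image: the output of the last layer will be a very low number, say 0.01. One interpretation of that number is that the probability of this image being a dog is 0.01. So what happens when you feed it directly to your visualization function? Yes, you guessed it: it finds all the regions in the image that are most relevant to a dog! You may still object, "But I have given it a cat image!" That does not matter, because this neuron activates when a dog is present, so when you take its gradient with respect to the convolutional layer, the regions most relevant to a dog are the ones that get highlighted in the heatmap. If you give the model a dog image, however, the visualization will be correct.

"So what should we do when we want to visualize the areas most relevant to a cat?" It is easy: just turn that neuron into a cat detector. "How?" Just take its complement: 1 - model_prediction. This gives you the probability that there is a cat in the image. You can then easily use it like this to plot the cat-relevant regions in the image: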

plot_conv_heat_map(1 - model_prediction, ...)
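To see why the complement changes the picture, note that the gradient of 1 - p is exactly the negative of the gradient of p, so features that lower the dog score are the ones that raise the cat score. The sketch below checks this numerically on a toy single-unit sigmoid "dog detector". The weights and features are invented for illustration, and it assumes that plot_conv_heat_map is gradient-based in the Grad-CAM style; its actual internals are not shown in this thread.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dog_score(features, weights):
    """Toy 'dog detector': one sigmoid unit over a feature vector."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)))

def numeric_grad(fn, features, eps=1e-6):
    """Central-difference gradient of fn with respect to each feature."""
    grads = []
    for i in range(len(features)):
        up = list(features)
        down = list(features)
        up[i] += eps
        down[i] -= eps
        grads.append((fn(up) - fn(down)) / (2 * eps))
    return grads

# Hypothetical setup: feature 0 is "dog-like", feature 1 is "cat-like".
weights = [2.0, -3.0]
features = [0.1, 1.5]   # a "cat" input: weak dog evidence, strong cat evidence

p_dog = dog_score(features, weights)
grad_dog = numeric_grad(lambda f: dog_score(f, weights), features)
grad_cat = numeric_grad(lambda f: 1.0 - dog_score(f, weights), features)
```

Here the cat-like feature gets a negative gradient under the dog score and a positive one under its complement, which is why a gradient-weighted heatmap built from 1 - model_prediction emphasizes the cat evidence instead.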

Alternatively, you can change the last layer of your model to have 2 neurons with a softmax activation and then re-train it:

temp_model = layers.Dense(2, activation="softmax")(temp_model)

This way each class, i.e. dog and cat, has its own neuron, so there is no problem when visualizing the activation heatmap.
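One practical detail when retraining with a two-unit softmax head: the label format may need to change as well. With the categorical_crossentropy loss, Keras expects one-hot labels, whereas sparse_categorical_crossentropy accepts the integer labels as they are. The helper below is a hypothetical plain-Python sketch of the one-hot conversion, assuming 0 = cat and 1 = dog as in the short answer; keras.utils.to_categorical performs a similar conversion.

```python
def to_one_hot(labels, num_classes=2):
    """Turn integer labels (assumed 0 = cat, 1 = dog) into one-hot rows."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} outside 0..{num_classes - 1}")
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

# Hypothetical labels as they would look in the single-sigmoid setup.
binary_labels = [0, 1, 1, 0]
one_hot_labels = to_one_hot(binary_labels)
```

After retraining, the original selection model.output[:, np.argmax(preds[0])] can be used unchanged, since the argmax then picks the cat or the dog neuron directly.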
