Why are the weights only usable in training?

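After calling fit, I can see the model converging during training, but when I then call evaluate, it behaves as if the model had never been fit at all. The clearest example is below, where I use the training generator for both training and validation and still get different results.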

import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint

from ImageGenerator import ImageGenerator

if __name__ == "__main__":

    batch_size = 64

    train_gen = ImageGenerator('synthetic3/train/open/*.png', 'synthetic3/train/closed/*.png', batch_size=batch_size)

    model = tf.keras.applications.mobilenet_v2.MobileNetV2(weights=None, classes=2, input_shape=(256, 256, 3))

    model.compile(optimizer='adam',
                  loss=tf.keras.losses.CategoricalCrossentropy(),
                  metrics=['accuracy'])

    history = model.fit(
        train_gen,
        validation_data=train_gen,
        epochs=5,
        verbose=1
    )
    
    model.evaluate(train_gen)

Results

Epoch 1/5
19/19 [==============================] - 11s 600ms/step - loss: 0.7707 - accuracy: 0.5016 - val_loss: 0.6932 - val_accuracy: 0.5016
Epoch 2/5
19/19 [==============================] - 10s 533ms/step - loss: 0.6991 - accuracy: 0.5855 - val_loss: 0.6935 - val_accuracy: 0.4975
Epoch 3/5
19/19 [==============================] - 10s 509ms/step - loss: 0.6213 - accuracy: 0.6637 - val_loss: 0.6932 - val_accuracy: 0.4992
Epoch 4/5
19/19 [==============================] - 10s 514ms/step - loss: 0.4407 - accuracy: 0.8158 - val_loss: 0.6934 - val_accuracy: 0.5008
Epoch 5/5
19/19 [==============================] - 10s 504ms/step - loss: 0.3200 - accuracy: 0.8643 - val_loss: 0.6949 - val_accuracy: 0.5000
19/19 [==============================] - 3s 159ms/step - loss: 0.6953 - accuracy: 0.4967

This is a problem because even when I save the weights, they are saved as if the model had never been fit.

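The evaluate() function takes a validation dataset as input to evaluate the trained model.

From the looks of it, you are using the training dataset (train_gen) for validation_data and passing the same dataset as the input to model.evaluate().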

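Hi everyone, after many days of pain I finally found the solution to this problem. It is caused by the batch normalization layers in the model. During training these layers normalize with the statistics of the current batch, but during validation and evaluate they use moving averages of those statistics, so if the moving averages lag behind, the same data gives very different results. If you plan to train on a custom dataset, you need to change the momentum parameter according to your batch size.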

new_momentum = 0.9  # placeholder value only; choose it based on your batch size
for layer in model.layers:
    if isinstance(layer, tf.keras.layers.BatchNormalization):
        # renorm=True can enable batch renormalization for smaller batch sizes
        layer.momentum = new_momentum

Source: https://github.com/tensorflow/tensorflow/issues/36065
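
For anyone wondering why the momentum matters so much here, this is a rough sketch in plain Python, not the actual Keras code. It assumes the moving-average update described in the Keras BatchNormalization docs (moving = moving * momentum + batch_stat * (1 - momentum)), the default momentum of 0.99, and a moving mean that starts at 0. The function name and the value of true_mean are made up for illustration; only the step count (19 batches per epoch, 5 epochs) comes from the log in the question.

```python
def moving_average_after(steps, momentum, batch_stat, initial=0.0):
    """Apply the exponential moving-average update `steps` times."""
    moving = initial
    for _ in range(steps):
        moving = momentum * moving + (1.0 - momentum) * batch_stat
    return moving


steps = 19 * 5    # 19 batches per epoch for 5 epochs, as in the question's log
true_mean = 10.0  # hypothetical activation mean, assumed constant across batches

# Default Keras momentum versus a lower, illustrative value.
slow = moving_average_after(steps, momentum=0.99, batch_stat=true_mean)
fast = moving_average_after(steps, momentum=0.9, batch_stat=true_mean)

print(f"momentum=0.99 -> moving mean {slow:.2f} (true mean {true_mean})")
print(f"momentum=0.90 -> moving mean {fast:.2f} (true mean {true_mean})")
```

Under these assumptions, after only 95 updates the moving mean with momentum 0.99 is still well short of the true value, while with 0.9 it has essentially caught up. That would fit the symptom above: training uses batch statistics and looks fine, while validation and evaluate use the lagging moving averages and stay near chance.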