Is it possible in TF/Keras to save the best model AFTER X epochs?

Question

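My models run very fast, but they appear to slow down because I am saving the best model (so that it can be loaded in another process), and I have noticed that the saving itself slows down training. In the early stages of fitting, every iteration is an improvement, so the saves add more and more delay.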

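I was wondering whether there is a way to save the best model only after X epochs, or to save it in the background, so that training is not delayed by saving too often?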

For clarity, this is how I am running ModelCheckpoint in Keras/TF2:

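from tensorflow.keras.callbacks import ModelCheckpoint

# save the model only when the monitored training loss improves
filepath = "BestModel.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
# fit the model
model.fit(x, y, epochs=40, batch_size=50, callbacks=callbacks_list)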

Answer 1

You can use the save_freq argument of the ModelCheckpoint callback to control the frequency of saving. By default, it is set to 'epoch', which means the model is saved at the end of each epoch; however, it can also be set to an integer, which specifies the number of batches that must pass between saves. Here is the relevant part of the documentation for the save_freq argument:

save_freq: 'epoch' or integer. When using 'epoch', the callback saves the model after each epoch. When using integer, the callback saves the model at end of this many batches. If the Model is compiled with experimental_steps_per_execution=N, then the saving criteria will be checked every Nth batch. Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch). Defaults to 'epoch'.
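Since an integer save_freq counts batches rather than epochs, "every X epochs" has to be converted into a batch count. The following is only a rough sketch of that arithmetic: the helper name batches_per_save and the figure of 1,000 training samples are made up for illustration, and it assumes the final partial batch of each epoch is kept, so that one epoch is ceil(num_samples / batch_size) batches.

```python
import math


def batches_per_save(num_samples: int, batch_size: int, every_n_epochs: int) -> int:
    """Return a save_freq value (in batches) that spans every_n_epochs epochs.

    Hypothetical helper, not part of Keras.
    """
    if num_samples <= 0 or batch_size <= 0 or every_n_epochs <= 0:
        raise ValueError("all arguments must be positive")
    # Assumption: the last, possibly smaller, batch of each epoch is not dropped.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return every_n_epochs * steps_per_epoch


# Assumed numbers: 1,000 samples, batch_size=50 as in the question,
# and a checkpoint check once every 5 epochs.
save_freq = batches_per_save(num_samples=1000, batch_size=50, every_n_epochs=5)
print(save_freq)  # 100

# With TensorFlow installed, the value would be passed on like this:
# checkpoint = ModelCheckpoint("BestModel.hdf5", monitor="loss", verbose=1,
#                              save_best_only=True, mode="min",
#                              save_freq=save_freq)
```

With an integer save_freq the check runs at the end of a training batch, so it is probably safest to monitor a training metric such as 'loss', as the question already does; validation metrics such as 'val_loss' are generally only computed at the end of an epoch.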

Tags: python, keras, tensorflow, tf.keras, tensorflow2.0