Python model.fit_generator gets stuck on first epoch and tries to compute "unknown" number of steps

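I'm new to Python and image classification models, but I have some TensorFlow code that worked fine until recently. When execution reaches the part of the code below, I now run into a problem. I'm running it in Google Colab notebooks.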

epochs = 5

history = model.fit_generator(train_generator, 
                    epochs=epochs,                 
                    validation_data=val_generator)

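fit_generator cannot work out the number of steps per epoch and reports it as "Unknown". The first epoch then runs indefinitely, and if I leave it long enough the accuracy slowly climbs to 1.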

Epoch 1/5
    325/Unknown - 992s 3s/step - loss: 0.2221 - accuracy: 0.9318

Does anyone know what would cause the number of steps per epoch to be unknown and training to never get past epoch 1?

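Here is some more information from the code that may be relevant (the training set has 1602 images, the test set has 395, and there are 11 different classes):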

Found 1602 images belonging to 11 classes.
Found 395 images belonging to 11 classes.

The batch size is set to 64.

for image_batch, label_batch in train_generator:
  break
image_batch.shape, label_batch.shape
# ((64, 224, 224, 3), (64, 11))

IMG_SHAPE = (IMAGE_SIZE, IMAGE_SIZE, 3)

# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                              include_top=False, 
                                              weights='imagenet')
base_model.trainable = False
model = tf.keras.Sequential([
  base_model,
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.GlobalAveragePooling2D(),
  tf.keras.layers.Dense(11, activation='softmax')
])

Model summary:

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
mobilenetv2_1.00_224 (Model) (None, 7, 7, 1280)        2257984   
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 5, 5, 32)          368672    
_________________________________________________________________
dropout_2 (Dropout)          (None, 5, 5, 32)          0         
_________________________________________________________________
global_average_pooling2d_2 ( (None, 32)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 11)                363       
=================================================================
Total params: 2,627,019
Trainable params: 369,035
Non-trainable params: 2,257,984

You should pass the steps_per_epoch and validation_steps arguments to fit_generator so the model knows how many batches make up the training and validation sets.

The value of each of these arguments is usually the number of samples divided by the batch size. In this case:

steps_per_epoch = 1602//64
validation_steps = 395//64
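As a small sketch of that arithmetic (the helper name `steps_for` and its `include_partial` flag are hypothetical, not part of Keras): floor division skips the final, smaller batch, while rounding up includes it. Which one you want depends on whether the last partial batch should count toward the epoch.

```python
import math

def steps_for(num_samples, batch_size, include_partial=False):
    """Return how many batches make up one pass over the data.

    Hypothetical helper for illustration only.
    include_partial=False mirrors the floor division used above;
    include_partial=True rounds up so the last, smaller batch is counted.
    """
    if include_partial:
        return math.ceil(num_samples / batch_size)
    return num_samples // batch_size

# Sizes taken from the question: 1602 training images, 395 validation images.
steps_per_epoch = steps_for(1602, 64)
validation_steps = steps_for(395, 64)
print(steps_per_epoch, validation_steps)
```

With the sizes from the question, floor division leaves 2 training images and 11 validation images out of each pass; passing `include_partial=True` would cover them at the cost of one extra step.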

Then:

model.fit_generator(train_generator,
                    epochs=epochs,
                    steps_per_epoch=1602//64,
                    validation_data=val_generator,
                    validation_steps=395//64)

This may be a multiprocessing-related issue. You can also try setting workers=1 and use_multiprocessing=False. That worked for me.
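A minimal sketch of combining both suggestions in one call, assuming a TensorFlow 2.x release where `fit_generator` still exists and accepts `workers` and `use_multiprocessing`. The wrapper name `fit_single_process` and its default sizes are hypothetical and only restate the numbers from the question:

```python
def fit_single_process(model, train_generator, val_generator, epochs,
                       train_size=1602, val_size=395, batch_size=64):
    """Call fit_generator with explicit step counts and no multiprocessing.

    Hypothetical wrapper for illustration; `model` is assumed to be a
    Keras model that still provides fit_generator.
    """
    return model.fit_generator(
        train_generator,
        epochs=epochs,
        # Explicit step counts so each epoch has a known length.
        steps_per_epoch=train_size // batch_size,
        validation_data=val_generator,
        validation_steps=val_size // batch_size,
        # Keep data loading in a single worker in the main process.
        workers=1,
        use_multiprocessing=False,
    )
```

Usage would be `history = fit_single_process(model, train_generator, val_generator, epochs=5)`. In more recent TensorFlow releases `fit_generator` is deprecated and `model.fit` accepts generators directly, so the call may need adjusting depending on the version Colab provides.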