Shape mismatch problem in tensorflow 2.2 training using yolo4.cfg

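I recently added a new feature to my yolov3 implementation: for convenience, models are now loaded directly from DarkNet cfg files. I tested the code with both the yolov3 and the yolov4 configuration, and both work fine except for training with v4. Shortly after training starts, I get a shape mismatch error. I would appreciate it if anyone could help me get rid of the error so I can finally complete my project. Let me know in the comments and I will provide whatever resources you need to help me solve the problem. Thanks in advance.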

This is what I run to reproduce the error:

if __name__ == '__main__':
    tr = Trainer((608, 608, 3),
                 '../Config/yolo4.cfg',
                 '../Config/beverly_hills.txt',
                 1344, 756, score_threshold=0.1,
                 train_tf_record='../Data/TFRecords/beverly_hills_train.tfrecord',
                 valid_tf_record='../Data/TFRecords/beverly_hills_test.tfrecord')

    tr.train(
        100,
        8,
        1e-3,
        dataset_name='beverly_hills',
        merge_evaluation=False,
        n_epoch_eval=10,
        clear_outputs=True
    )

Links to the files you need:

The error message is as follows:

Traceback (most recent call last):
  File "trainer.py", line 629, in <module>
    clear_outputs=True
  File "../Helpers/utils.py", line 62, in wrapper
    result = func(*args, **kwargs)
  File "trainer.py", line 490, in train
    validation_data=valid_dataset,
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1090, in fit
    tmp_logs = train_function(iterator)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 766, in __call__
    result = self._call(*args, **kwds)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 826, in _call
    return self._stateless_fn(*args, **kwds)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2811, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1838, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1914, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 549, in call
    ctx=ctx)
  File "/root/.local/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [4,76,76,3,1] vs. [4,19,19,3,1]
     [[node yolo_loss/logistic_loss/mul (defined at ../Helpers/utils.py:260) ]] [Op:__inference_train_function_38735]

Errors may have originated from an input operation.
Input Source operations connected to node yolo_loss/logistic_loss/mul:
 yolo_loss/split_1 (defined at ../Helpers/utils.py:222) 
 yolo_loss/split (defined at ../Helpers/utils.py:196)

Function call stack:
train_function

When I change batch_size to 8 instead of 4, the error changes to the following (the error source changes):

Traceback (most recent call last):
  File "/Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Main/trainer.py", line 693, in <module>
    clear_outputs=True,
  File "/Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Helpers/utils.py", line 62, in wrapper
    result = func(*args, **kwargs)
  File "/Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Main/trainer.py", line 526, in train
    validation_data=valid_dataset,
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 66, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 848, in fit
    tmp_logs = train_function(iterator)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1665, in _filtered_call
    self.captured_inputs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1746, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 598, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [8,13,13,3,2] vs. [8,52,52,3,2]
     [[node gradient_tape/yolo_loss/sub_5/BroadcastGradientArgs (defined at Users/emadboctor/Desktop/Code/yolov3-keras-tf2/Main/trainer.py:526) ]] [Op:__inference_train_function_42744]

Function call stack:
train_function

Adding these lines to models.py solved the shape problem, and training started as expected:

if '4' in self.model_configuration:
    self.output_layers.reverse()
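
A likely reason the reversal works: the stock yolov3.cfg lists its three [yolo] heads from coarse to fine (strides 32, 16, 8), while yolov4.cfg lists them from fine to coarse (strides 8, 16, 32). If the labels and the loss are built in the v3 order, which the errors above suggest but which I am assuming here, each v4 output gets compared against the label of the wrong scale (76 vs. 19 at 608 input, 52 vs. 13 at 416 input). The sketch below only illustrates that arithmetic. `order_coarse_to_fine` is a hypothetical helper, not part of the project, and the channel count of 6 is a placeholder.

```python
def grid_sizes(input_size, strides):
    """Grid width/height produced by each detection head, in the given order."""
    return [input_size // stride for stride in strides]


# Order in which the [yolo] heads appear in the stock Darknet cfg files.
V3_HEAD_STRIDES = [32, 16, 8]  # yolov3.cfg: coarse to fine
V4_HEAD_STRIDES = [8, 16, 32]  # yolov4.cfg: fine to coarse


def order_coarse_to_fine(head_shapes):
    """Hypothetical helper: sort head shapes, given as
    (batch, grid, grid, anchors, channels), so the smallest grid comes
    first, whatever order the cfg file used."""
    return sorted(head_shapes, key=lambda shape: shape[1])


if __name__ == "__main__":
    print(grid_sizes(608, V3_HEAD_STRIDES))  # [19, 38, 76]
    print(grid_sizes(608, V4_HEAD_STRIDES))  # [76, 38, 19]

    channels = 6  # placeholder: 5 box/objectness values + 1 hypothetical class
    v4_shapes = [(4, g, g, 3, channels) for g in grid_sizes(608, V4_HEAD_STRIDES)]
    print([shape[1] for shape in order_coarse_to_fine(v4_shapes)])  # [19, 38, 76]
```

Sorting by grid size instead of checking for '4' in the configuration path might be a more robust variant of the same fix, since that check would also match any unrelated path that happens to contain the character '4'.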