Tensorflow Object detection model evaluation on Test Dataset

Question

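I have fine-tuned the faster_rcnn_resnet101 model available in the Model Zoo to detect my custom objects. I split the data into a training set and an evaluation set, and used both in the config file during training. Now that training is complete, I want to test my model on unseen data (I call it the test data). I tried a few functions, but I cannot work out which code in the TensorFlow API to use to evaluate performance on the test dataset. Here is what I have tried: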

I used the object_detection/metrics/offline_eval_map_corloc.py script to evaluate the test dataset. The code runs fine, but I get negative values for AP and AR on medium and large bounding boxes.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.459
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.601
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.543
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.459
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.543
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.627
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.628
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.628
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000

Now, I know that mAP and AR cannot be negative, so something is wrong. Why do I see negative values when I run the offline evaluation on the test dataset?

The commands I used to run this pipeline are:

SPLIT=test

echo "
label_map_path: '/training_demo/annotations/label_map.pbtxt'
tf_record_input_reader: { input_path: '/training_demo/Predictions/test.record' }
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt

echo "
metrics_set: 'coco_detection_metrics'
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt 

python object_detection/metrics/offline_eval_map_corloc.py \
  --eval_dir='/training_demo/test_eval_metrics' \
  --eval_config_path='training_demo/test_eval_metrics/test_eval_config.pbtxt' \
  --input_config_path='/training_demo/test_eval_metrics/test_input_config.pbtxt'

I also tried object_detection/legacy/eval.py, but the evaluation metrics I get are negative as well:

DetectionBoxes_Recall/AR@100 (medium): -1.0
DetectionBoxes_Recall/AR@100 (small): -1.0
DetectionBoxes_Precision/mAP@.50IOU: -1.0
DetectionBoxes_Precision/mAP (medium): -1.0
etc.

The command I used was:

python eval.py \
  --logtostderr \
  --checkpoint_dir=trained-inference-graphs/output_inference_graph/ \
  --eval_dir=test_eval_metrics \
  --pipeline_config_path=training/faster_rcnn_resnet101_coco-Copy1.config

The eval_input_reader in faster_rcnn_resnet101_coco-Copy1.config points to the test TFRecord, which contains the ground truth and detection information.

I also tried object_detection/utils/object_detection_evaluation to get the evaluation. This is no different from the first approach, because it uses the same underlying function: evaluator.evaluate()

I would appreciate any help with this.

Answer 1

The evaluation metrics are in COCO format, so you can refer to the COCO API for the meaning of these values.

As specified in the COCO API code, -1 is the default value when a category is absent. In your case, all of the objects fall into the 'small' area category only, so 'medium' and 'large' have nothing to evaluate. Note also that the 'small', 'medium' and 'large' area categories depend on the number of pixels the box area covers, as defined in the COCO API.
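To make the point about area categories concrete, here is a minimal sketch of the bucketing. It assumes the default COCO thresholds of 32² and 96² pixels for the small/medium and medium/large boundaries. The helper names (`area_bucket`, `count_buckets`, `summarize`) and the sample boxes are hypothetical and are not part of pycocotools or the Object Detection API; the boundary handling is simplified compared with the real evaluator.

```python
# Illustrative sketch only: mimics how COCO-style evaluation groups boxes by
# pixel area and reports -1 for a group that has nothing to evaluate.
# All helper names here are hypothetical, not pycocotools functions.

SMALL_MAX = 32 ** 2    # 1024 px^2, assumed default small/medium boundary
MEDIUM_MAX = 96 ** 2   # 9216 px^2, assumed default medium/large boundary


def area_bucket(width, height):
    """Return 'small', 'medium' or 'large' for a box measured in pixels."""
    area = width * height
    if area < SMALL_MAX:
        return "small"
    if area < MEDIUM_MAX:
        return "medium"
    return "large"


def count_buckets(boxes):
    """Count boxes per area bucket; each box is a (width, height) pair."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for width, height in boxes:
        counts[area_bucket(width, height)] += 1
    return counts


def summarize(values):
    """Average the valid entries; -1 is the sentinel for 'no valid entries'."""
    valid = [v for v in values if v > -1]
    return sum(valid) / len(valid) if valid else -1.0


# Hypothetical ground-truth boxes, all smaller than 32x32 pixels.
gt_boxes = [(20, 25), (30, 30), (10, 12)]
counts = count_buckets(gt_boxes)
print(counts)  # {'small': 3, 'medium': 0, 'large': 0}
```

If a similar count over your own test annotations shows zero medium and large boxes, the -1.000 rows are simply the "no data" sentinel rather than a real negative score.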

Answer 2

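In my case, I simply run model_main.py once, after changing the eval_input_reader in pipeline.config to point to the test dataset. I am not sure whether this is how it is supposed to be done, though.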

python model_main.py \
    --alsologtostderr \
    --run_once \
    --checkpoint_dir=$path_to_model \
    --model_dir=$path_to_eval \
    --pipeline_config_path=$path_to_config

pipeline.config

eval_config: {
  metrics_set: "coco_detection_metrics"
  num_examples: 721 # no of test images
  num_visualizations: 10 # no of visualizations for tensorboard
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/path/to/test-data.record"
  }
  label_map_path: "/path/to/label_map.pbtxt"
  shuffle: true
  num_readers: 1
}

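For me, there was also no difference in mAP between the validation dataset and the test dataset. So I am not sure whether it is really necessary to split the data into training, validation and test sets.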

Answer 3

!python eval.py --logtostderr --pipeline_config_path= --checkpoint_dir= --eval_dir=eval/

You can find eval.py in the legacy folder.


python

object-detection

tensorflow

object-detection-api