Normalized label box for object detection API tensorflow

Question

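When I use labelImg (https://github.com/tzutalin/labelImg) to draw bounding boxes around my objects and output an annotation.xml file, it gives me the coordinates of the bounding boxes. I use these annotations as input to object detection models in TensorFlow (ssd_mobilenet_v1_coco & faster_rcnn_resnet101_coco). The predicted outputs (xmin, ymin, xmax, ymax) range from 0 to 1.

Is the input in my annotation.xml normalized to 0 - 1? I want to know this because I want to compute the IOU by feeding the ground truth and the predicted bounding boxes into my own IOU function. Thanks.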

Answer 1

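Basically, if you feed your model a tf.record file, that file contains your images and the normalized coordinates of the bounding boxes. The annotation.xml that labelImg writes in Pascal VOC format holds absolute pixel coordinates, so the conversion from the .xml files to a tf.record file is the step that normalizes your bounding box coordinates.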
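As a rough illustration of that normalization step, here is a minimal sketch that reads a Pascal VOC style annotation and divides each coordinate by the image size. It assumes the usual labelImg layout (a size element with width and height, and one bndbox per object). The function name normalized_boxes_from_voc and the SAMPLE_XML string are made up for this example; this is not the conversion script itself.

```python
import xml.etree.ElementTree as ET


def normalized_boxes_from_voc(xml_text):
    """Return [(label, xmin, ymin, xmax, ymax)] with coordinates scaled to 0-1.

    Assumes a Pascal VOC style annotation with pixel coordinates.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = float(size.find("width").text)
    height = float(size.find("height").text)

    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        # x values are divided by the image width, y values by the height.
        xmin = float(bndbox.find("xmin").text) / width
        ymin = float(bndbox.find("ymin").text) / height
        xmax = float(bndbox.find("xmax").text) / width
        ymax = float(bndbox.find("ymax").text) / height
        boxes.append((obj.find("name").text, xmin, ymin, xmax, ymax))
    return boxes


# Hypothetical annotation for a 200x100 image with a single box.
SAMPLE_XML = """<annotation>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>50</xmin><ymin>20</ymin><xmax>150</xmax><ymax>80</ymax></bndbox>
  </object>
</annotation>"""

if __name__ == "__main__":
    print(normalized_boxes_from_voc(SAMPLE_XML))
```

For your own files you would read the XML text from disk and pass it in. The resulting values are then on the same 0 - 1 scale as the model output.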

The output of your model will also be in normalized coordinates. You can easily rescale them by multiplying by the image size:

x_min_abs = x_min_rel * image_width
x_max_abs = x_max_rel * image_width
y_min_abs = y_min_rel * image_height
y_max_abs = y_max_rel * image_height
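
Putting the rescaling together with an IOU computation, a minimal sketch could look like the following. It assumes both boxes are given as (xmin, ymin, xmax, ymax) and are on the same scale (both in pixels or both normalized). The helper names to_absolute and iou are hypothetical. Also note that the Object Detection API typically returns detection boxes in [ymin, xmin, ymax, xmax] order, so you may need to reorder them first.

```python
def to_absolute(box, image_width, image_height):
    """Rescale a normalized (xmin, ymin, xmax, ymax) box to pixel coordinates."""
    xmin, ymin, xmax, ymax = box
    return (
        xmin * image_width,
        ymin * image_height,
        xmax * image_width,
        ymax * image_height,
    )


def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    ground_truth = (50.0, 20.0, 150.0, 80.0)   # pixels, from annotation.xml
    prediction = (0.25, 0.2, 0.75, 0.8)        # normalized model output
    prediction_abs = to_absolute(prediction, image_width=200, image_height=100)
    print(iou(ground_truth, prediction_abs))
```

Since IOU is a ratio of areas, you could just as well normalize the ground truth instead of rescaling the prediction. What matters is that both boxes use the same scale and the same coordinate order.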

object-detection

tensorflow