Deep Learning Multi-Class Classification: which one must be the same, instance count or image count?

Question

I am training a model using Mask R-CNN with 2 classes: vehicle and road. I have a question about preparing the dataset. Which option is better and gives higher accuracy?

>>> 1 - Having the same number of instances in the whole dataset, like:
Car images: 50
Total cars: 500 (each car image has 10 cars)
Road images: 500
Total roads: 500 (each road image has 1 road)
>>> Here the instance counts of roads and cars are equal.

>>> 2 - Having the same number of images in the whole dataset, like:
Car images: 500
Total cars: 10000 (each car image has 20 cars)
Road images: 500
Total roads: 700 (each road image has 1-2 roads)
>>> Here the image counts of roads and cars are equal.

Which option is better for getting higher accuracy? Thank you for your time.

Answer 1

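The classification and mask networks operate only on region proposals, which are tied to the number of objects rather than the number of images, so you should mainly focus on the number of cars and roads. That said, you should also use as large a dataset as you can. If you have enough data and a decently sized network, an imbalanced dataset should not be a problem, unless one of your classes is rare.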
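To see the difference between the two counts concretely, here is a minimal sketch that tallies instances and images per class. It assumes annotations are available as `(image_id, class_name)` pairs, one pair per object instance; that layout and the helper name `count_per_class` are hypothetical, not part of any Mask R-CNN implementation. The toy data mirrors option 1 from the question.

```python
from collections import Counter


def count_per_class(annotations):
    """Return (instance_counts, image_counts) per class.

    annotations: iterable of (image_id, class_name) pairs,
    one pair per annotated object instance (assumed layout).
    """
    instance_counts = Counter()
    images_per_class = {}
    for image_id, class_name in annotations:
        # Every annotation is one object instance of its class.
        instance_counts[class_name] += 1
        # Track the distinct images in which the class appears.
        images_per_class.setdefault(class_name, set()).add(image_id)
    image_counts = {name: len(ids) for name, ids in images_per_class.items()}
    return dict(instance_counts), image_counts


# Toy data mirroring option 1: 50 car images with 10 cars each,
# and 500 road images with 1 road each.
annotations = [(f"car_{i}", "car") for i in range(50) for _ in range(10)]
annotations += [(f"road_{i}", "road") for i in range(500)]

instance_counts, image_counts = count_per_class(annotations)
print(instance_counts)  # {'car': 500, 'road': 500}
print(image_counts)     # {'car': 50, 'road': 500}
```

The instance counts are what the classification and mask heads effectively see, which is why they are the numbers worth comparing first.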

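Try using your whole dataset first. If you run into problems recognizing roads, have a look at this discussion on how to deal with imbalanced datasets: https://datascience.stackexchange.com/questions/38796/unbalanced-training-data-for-different-classes/38815#38815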
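If the imbalance does turn out to hurt road recognition, one common heuristic is to weight each class inversely to its instance count. The sketch below is only an illustration of that idea, not necessarily what the linked discussion recommends; the function name `inverse_frequency_weights` is hypothetical, and how such weights would be plugged into the loss depends on the Mask R-CNN implementation you use. The counts are taken from option 2 in the question.

```python
def inverse_frequency_weights(instance_counts):
    """Weight each class by total / (num_classes * count).

    A perfectly balanced dataset gives every class a weight of 1.0;
    under-represented classes get weights above 1.0.
    """
    total = sum(instance_counts.values())
    num_classes = len(instance_counts)
    return {
        name: total / (num_classes * count)
        for name, count in instance_counts.items()
    }


# Instance counts from option 2: 10000 cars and 700 roads.
weights = inverse_frequency_weights({"car": 10000, "road": 700})
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

With these counts the road class receives a much larger weight than the car class, so errors on roads would count for more during training.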
