Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern Object Detectors
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2020-04-05, DOI: arXiv-2004.02877
Ali Borji

Object detection remains one of the most notorious open problems in computer vision. Despite large strides in accuracy in recent years, modern object detectors have started to saturate on popular benchmarks, raising the question of how far we can reach with deep learning tools and tricks. Here, by employing two state-of-the-art object detection benchmarks, and analyzing more than 15 models over 4 large-scale datasets, we I) carefully determine the upper bound in AP, which is 91.6% on VOC (test2007), 78.2% on COCO (val2017), and 58.9% on OpenImages V4 (validation), regardless of the IOU threshold. These numbers are much better than the mAP of the best model (47.9% on VOC and 46.9% on COCO; IOUs=.5:.05:.95), II) characterize the sources of errors in object detectors, in a novel and intuitive way, and find that classification error (confusion with other classes and misses) explains the largest fraction of errors and weighs more than localization and duplicate errors, and III) analyze the invariance properties of models when the surrounding context of an object is removed, when an object is placed in an incongruent background, and when images are blurred or flipped vertically. We find that models generate a lot of boxes on empty regions and that context is more important for detecting small objects than larger ones. Our work taps into the tight relationship between object detection and object recognition and offers insights for building better models. Our code is publicly available at https://github.com/aliborji/Deetctionupperbound.git.
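The abstract's mAP figures are averaged over IoU thresholds .5:.05:.95, the standard COCO evaluation sweep. As a minimal illustration (not the paper's own code), the sketch below computes the IoU of two boxes and enumerates that threshold sweep; box format and names here are illustrative assumptions.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    # Intersection rectangle (clamped to zero if the boxes do not overlap)
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    # Union = sum of areas minus intersection
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# COCO-style sweep: 10 thresholds from 0.5 to 0.95 in steps of 0.05.
# An AP is computed at each threshold and the results are averaged into mAP.
thresholds = np.linspace(0.5, 0.95, 10)

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # half-overlapping boxes -> 1/3
```

At IoU threshold 0.5 the detection above would count as a miss (1/3 < 0.5), which is why a single detector's score can vary sharply across the sweep.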

Updated: 2020-04-08