A progressive learning framework based on single-instance annotation for weakly supervised object detection
Computer Vision and Image Understanding (IF 4.5), Pub Date: 2020-01-15, DOI: 10.1016/j.cviu.2020.102903
Ming Zhang, Bing Zeng

Fully-supervised object detection (FSOD) and weakly-supervised object detection (WSOD) are two extremes in the field of object detection. The former relies entirely on detailed bounding-box annotations, while the latter discards them completely. To balance these two extremes, we propose to make use of so-called single-instance annotations, i.e., every image that contains only a single object is labelled with the corresponding bounding box. Using these instance annotations of the simplest images, we propose a progressive learning framework that integrates image-level learning, single-instance learning, and multi-instance learning into an end-to-end network. Specifically, our framework is composed of three parallel streams that share a proposal feature extractor. The first stream is supervised by image-level annotations, which provide the shared feature extractor with global information about all training data. The second stream is supervised by single-instance annotations to bridge the feature-learning gap between the image level and the instance level. To further learn from complex images, we propose an overlap-based instance mining algorithm that mines pseudo multi-instance annotations from the detection results of the second stream, and we use them to supervise the third stream. Our method achieves a trade-off between detection accuracy and annotation cost. Extensive experiments on the PASCAL VOC and MS-COCO datasets demonstrate the effectiveness of the proposed method, implying that a few single-instance annotations can improve the detection performance of WSOD significantly (by more than 10%) and reduce the average annotation cost of FSOD greatly (by more than a factor of 5).
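The abstract does not spell out the overlap-based instance mining step, so the following is only a minimal sketch of one plausible reading: keep confident detections from the second stream and discard boxes that overlap an already-kept box too heavily, treating the survivors as pseudo multi-instance annotations. The function names (`iou`, `mine_pseudo_instances`), the thresholds, and the greedy selection rule are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical convention


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mine_pseudo_instances(
    boxes: List[Box],
    scores: List[float],
    score_thresh: float = 0.5,  # assumed value, not from the paper
    iou_thresh: float = 0.3,    # assumed value, not from the paper
) -> List[Box]:
    """Greedily keep confident boxes that do not overlap a kept box too much."""
    # Consider only confident detections, highest score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        # A box that overlaps a kept box heavily is assumed to be the same object.
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]


if __name__ == "__main__":
    detections = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
    confidences = [0.9, 0.8, 0.7, 0.2]
    print(mine_pseudo_instances(detections, confidences))
```

In this toy input the second box overlaps the first heavily and the fourth falls below the score threshold, so only two boxes would be passed on as pseudo annotations for the third stream. The actual mining rule in the paper may differ.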




Last updated: 2020-01-15