Multi-peak Graph-based Multi-instance Learning for Weakly Supervised Object Detection, ACM Transactions on Multimedia Computing, Communications, and Applications



Multi-peak Graph-based Multi-instance Learning for Weakly Supervised Object Detection
ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.2), Pub Date: 2021-06-14, DOI: 10.1145/3432861
Ruyi Ji₁, Zeyu Liu₂, Libo Zhang₃, Jianwei Liu₂, Xin Zuo₂, Yanjun Wu₃, Chen Zhao₃, Haofeng Wang₄, Lin Yang₅


Weakly supervised object detection (WSOD), which aims to detect objects using only image-level annotations, has become a research hotspot over the past few years. Recently, much effort has been devoted to WSOD because of its simple yet effective architecture, and remarkable improvements have been achieved. Existing approaches based on multiple-instance learning (MIL) usually treat proposals individually and ignore the relations between them. Moreover, to obtain pseudo-ground-truth boxes for WSOD, MIL-based methods tend to select the region with the highest confidence score and to treat regions that overlap it only slightly as background, which leads to mislabeled instances. As a result, these methods suffer from mislabeled instances and from a lack of relations between proposals, both of which degrade WSOD performance. To tackle these issues, this article introduces a multi-peak graph-based model for WSOD. Specifically, we use an instance graph to model the relations between proposals, which reinforces the multiple-instance learning process. In addition, a multi-peak discovery strategy is designed to avoid mislabeling instances. The proposed model is trained end to end with a stochastic gradient descent optimizer using back-propagation. Extensive quantitative and qualitative evaluations on two challenging public benchmarks, PASCAL VOC 2007 and PASCAL VOC 2012, demonstrate the superiority and effectiveness of the proposed approach.
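The mislabeling problem described in the abstract can be illustrated with a small sketch. The code below is not the authors' method. It only mimics the baseline behaviour the abstract criticizes: take the single highest-scoring proposal as the pseudo-ground-truth box and mark every proposal with low overlap as background. The function names, the box format `(x1, y1, x2, y2)`, the 0.5 overlap threshold, and the toy boxes and scores are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top1_pseudo_labels(boxes, scores, fg_thresh=0.5):
    """Label proposals 1 (object) or 0 (background) using only the best box.

    Hypothetical helper: fg_thresh is an assumed overlap threshold.
    """
    best = max(range(len(boxes)), key=lambda i: scores[i])
    return [1 if iou(box, boxes[best]) >= fg_thresh else 0 for box in boxes]


# Toy image with two separate objects of the same class (made-up numbers).
boxes = [
    (10, 10, 110, 110),    # object A, highest score
    (15, 12, 112, 108),    # object A, slightly shifted proposal
    (300, 40, 400, 140),   # object B, a second confident peak
    (500, 500, 520, 520),  # clutter
]
scores = [0.90, 0.70, 0.85, 0.05]

labels = top1_pseudo_labels(boxes, scores)
print(labels)  # [1, 1, 0, 0]
```

In this toy case the proposal covering object B scores 0.85 but is still labeled background, because it does not overlap the single selected peak. This is the kind of mislabeled instance that a strategy considering several peaks is meant to avoid.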


Updated: 2021-06-14
