PML-LocNet: Improving Object Localization with Prior-induced Multi-view Learning Network

IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2019-10-28, DOI: 10.1109/tip.2019.2947155
Xiaopeng Zhang, Yang Yang, Hongkai Xiong, Jiashi Feng

This paper introduces a new model for Weakly Supervised Object Localization (WSOL), where only image-level supervision is provided. The key to solving such problems is to infer the object locations accurately. Previous methods usually model the missing object locations as latent variables and alternate between updating their estimates and learning a detector accordingly. However, the performance of such alternating optimization is sensitive to the quality of the initial latent variables, and the resulting localization model is prone to overfitting to improper localizations. To address these issues, we develop a Prior-induced Multi-view Learning Localization Network (PML-LocNet), which exploits both view diversity and sample diversity to improve object localization. View diversity is imposed by a two-phase multi-view learning strategy, which leverages the complementarity among features learned from different views and the consensus among instances localized in each view. Sample diversity is pursued by harnessing coarse-to-fine priors at both the image and instance levels. With these priors, reliable samples receive more emphasis and the contributions of unreliable ones are reduced, so that the intrinsic characteristics of each sample can be exploited to make the model more robust during network learning. PML-LocNet can be easily combined with existing WSOL models to further improve localization accuracy. Experiments demonstrate its effectiveness: it achieves 69.3% CorLoc and 50.4% mAP on PASCAL VOC 2007, surpassing prior state-of-the-art methods by a large margin.
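For readers unfamiliar with the CorLoc figure quoted above, the following is a minimal sketch of how the metric is commonly computed for a single object class: an image counts as correctly localized when the predicted box overlaps some ground-truth box with intersection over union (IoU) of at least 0.5. This is not the authors' evaluation code; the function names, the box format `(x_min, y_min, x_max, y_max)`, and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(
    predictions: Dict[str, Box],
    ground_truth: Dict[str, List[Box]],
    threshold: float = 0.5,
) -> float:
    """Fraction of images (for one class) whose single predicted box
    matches any ground-truth box with IoU >= threshold.

    `ground_truth` should list only images that contain the class.
    Images with no prediction count as misses.
    """
    if not ground_truth:
        return 0.0
    hits = 0
    for image_id, gt_boxes in ground_truth.items():
        pred = predictions.get(image_id)
        if pred is not None and any(iou(pred, gt) >= threshold for gt in gt_boxes):
            hits += 1
    return hits / len(ground_truth)
```

In practice, the per-class values are averaged over all classes to obtain a single dataset-level CorLoc figure; the sketch covers only the per-class step.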


Updated: 2020-04-22
