Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning, International Journal of Computer Vision

Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning
International Journal of Computer Vision (IF 11.6), Pub Date: 2020-01-30, DOI: 10.1007/s11263-020-01293-3
Xiang Wang, Sifei Liu, Huimin Ma, Ming-Hsuan Yang

Weakly-supervised semantic segmentation is a challenging task because no pixel-wise label information is provided for training. Recent methods have exploited classification networks to localize objects by selecting regions with strong responses. Such response maps provide only sparse information; however, there exist strong pairwise relations between pixels in natural images, which can be used to propagate the sparse map into a much denser one. In this paper, we propose an iterative algorithm to learn such pairwise relations. It consists of two branches: a unary segmentation network, which learns the label probabilities for each pixel, and a pairwise affinity network, which learns an affinity matrix and refines the probability map generated by the unary network. The results refined by the pairwise network are then used as supervision to train the unary network, and the two steps are repeated iteratively to obtain progressively better segmentation. To learn reliable pixel affinity without accurate annotation, we also propose to mine confident regions. We show that iteratively training this framework is equivalent to optimizing an energy function with convergence to a local minimum. Experimental results on the PASCAL VOC 2012 and COCO datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
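
The propagation idea in the abstract (spreading a sparse response map through pairwise pixel relations) can be illustrated with a minimal NumPy sketch. This is not the paper's method: the learned affinity network is replaced here by a hand-built Gaussian affinity on toy pixel features, and the names `gaussian_affinity`, `propagate`, `sigma`, and `steps` are hypothetical choices made for illustration only.

```python
import numpy as np


def gaussian_affinity(features, sigma=0.2):
    """Row-normalized pairwise affinity between pixels.

    Assumption: a fixed Gaussian kernel on feature distance stands in for
    the learned pairwise affinity network described in the abstract.
    """
    diff = features[:, None, :] - features[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    affinity = np.exp(-dist2 / (2.0 * sigma ** 2))
    # Normalize each row so it acts as a weighted average over all pixels.
    return affinity / affinity.sum(axis=1, keepdims=True)


def propagate(prob, affinity, steps=10):
    """Diffuse per-pixel class probabilities through the affinity matrix."""
    refined = prob.copy()
    for _ in range(steps):
        refined = affinity @ refined
    # Rows already sum to 1; renormalize to guard against rounding drift.
    return refined / refined.sum(axis=1, keepdims=True)


# Toy 1-D "image" with 6 pixels: the first three look alike, so do the last three.
features = np.array([[0.0], [0.1], [0.05], [1.0], [0.9], [1.05]])

# Sparse response: only pixel 0 (class 0) and pixel 5 (class 1) are confident.
prob = np.array([
    [0.9, 0.1],
    [0.5, 0.5],
    [0.5, 0.5],
    [0.5, 0.5],
    [0.5, 0.5],
    [0.1, 0.9],
])

affinity = gaussian_affinity(features)
refined = propagate(prob, affinity)
labels = refined.argmax(axis=1)
print(labels)
```

In this toy setup the uncertain pixels take on the label of the confident pixel they resemble, which is the sense in which a sparse map becomes denser. In the framework the abstract describes, such a refined map would then serve as supervision for the unary network, and the two steps would alternate.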


Updated: 2020-01-30
