
Weakly supervised semantic segmentation using distinct class specific saliency maps
Computer Vision and Image Understanding (IF 4.5), Pub Date: 2018-09-15, DOI: 10.1016/j.cviu.2018.08.006
Wataru Shimoda, Keiji Yanai

Weakly supervised segmentation has drawn considerable attention because of the high cost of creating the pixel-wise annotated image datasets used to train fully supervised segmentation models. We propose a weakly supervised semantic segmentation method that uses CNN-based class-specific saliency maps and a fully connected CRF. To obtain distinct class-specific saliency maps (DCSM) that can serve as the unary potentials of the CRF, we propose a novel method of estimating class saliency maps that significantly improves on the method of Simonyan et al. (2014) in three ways: (1) using CNN derivatives with respect to the feature maps of intermediate convolutional layers, followed by up-sampling, instead of derivatives with respect to the input image; (2) subtracting the saliency maps of the other classes from the saliency map of the target class to differentiate target objects from other objects; and (3) aggregating multiple-scale class saliency maps to compensate for the low resolution of the feature maps. In addition, we propose a novel algorithm for estimating segmentation “Easiness”, used in combination with the proposed saliency-based method. Wei et al. (2016) recently demonstrated that training a fully supervised segmentation model on initial masks estimated in a weakly supervised setting improves the performance of weakly supervised segmentation. However, the estimated initial masks tend to contain noise, which sometimes produces erroneous results. We therefore focus on improving the quality of the initial masks used to train the fully supervised segmentation model. We propose a method for retrieving “good seeds” by predicting the segmentation “Easiness” of images from the consistency of the outputs under different conditions, and we show that the proposed method can retrieve such “good seeds”. Despite the trade-off between training data quality and the number of training images, the retrieved images can improve the accuracy of weakly supervised segmentation when combined with data augmentation.
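
The abstract gives no formulas, so the following Python sketch is only an illustration of improvements (2) and (3) and of a consistency-based "Easiness" score, not the authors' implementation. It assumes that other-class maps are combined by a per-pixel maximum and that negative differences are clipped to zero, that scales are fused by nearest-neighbour up-sampling followed by averaging, and that consistency is measured as the intersection-over-union of two masks. All function names are hypothetical. Improvement (1), gradients with respect to intermediate feature maps, needs a deep learning framework and is not covered here.

```python
from typing import Dict, List

Map = List[List[float]]  # an H x W saliency map stored as nested lists


def subtract_other_classes(maps: Dict[str, Map]) -> Dict[str, Map]:
    """Subtract, per pixel, the strongest competing class from each class map.

    Assumption: competitors are combined with max() and the result is clipped
    at zero; the abstract does not specify either choice.
    """
    out: Dict[str, Map] = {}
    for target, smap in maps.items():
        others = [m for name, m in maps.items() if name != target]
        out[target] = [
            [
                max(0.0, v - max((o[y][x] for o in others), default=0.0))
                for x, v in enumerate(row)
            ]
            for y, row in enumerate(smap)
        ]
    return out


def upsample_nearest(smap: Map, height: int, width: int) -> Map:
    """Resize a map to height x width with nearest-neighbour sampling."""
    h, w = len(smap), len(smap[0])
    return [
        [smap[y * h // height][x * w // width] for x in range(width)]
        for y in range(height)
    ]


def aggregate_scales(maps: List[Map], height: int, width: int) -> Map:
    """Average maps computed at several scales after resizing to one size."""
    resized = [upsample_nearest(m, height, width) for m in maps]
    n = len(resized)
    return [
        [sum(m[y][x] for m in resized) / n for x in range(width)]
        for y in range(height)
    ]


def easiness(mask_a: List[List[int]], mask_b: List[List[int]]) -> float:
    """Hypothetical consistency score: IoU of two binary foreground masks."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 1.0


# Toy example: two classes on a 2 x 2 image.
class_maps = {
    "cat": [[0.9, 0.2], [0.8, 0.1]],
    "dog": [[0.3, 0.7], [0.1, 0.6]],
}
distinct = subtract_other_classes(class_maps)
fused = aggregate_scales([[[1.0]], [[1.0, 3.0], [1.0, 3.0]]], 2, 2)
score = easiness([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

In this toy example the "cat" map keeps its response only in the left column, where it exceeds "dog", which is the intended effect of the subtraction. Under the assumptions above, images whose masks agree across conditions (a score near 1) would be the candidates retained as "good seeds".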


Updated: 2020-01-04
