Coarse-to-Fine Semantic Segmentation From Image-Level Labels, IEEE Transactions on Image Processing

IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2019-07-12, DOI: 10.1109/tip.2019.2926748
Longlong Jing, Yucheng Chen, Yingli Tian

Semantic segmentation based on deep neural networks generally requires large-scale, costly annotations for training to achieve good performance. To avoid the pixel-wise segmentation annotations that most methods need, some researchers have recently attempted to use object-level labels (e.g., bounding boxes) or image-level labels (e.g., image categories) instead. In this paper, we propose a novel recursive coarse-to-fine semantic segmentation framework based only on image-level category labels. For each image, an initial coarse mask is first generated by a convolutional neural network-based unsupervised foreground segmentation model and is then enhanced by a graph model. The enhanced coarse mask is fed to a fully convolutional neural network to be recursively refined. Unlike existing image-level label-based semantic segmentation methods, which require labels for all categories in images that contain multiple types of objects, our framework needs only one label per image and can still handle images that contain objects of multiple categories. Trained only on ImageNet, our framework achieves performance on the PASCAL VOC dataset comparable to that of other state-of-the-art image-level label-based semantic segmentation methods. Furthermore, our framework can be easily extended to the foreground object segmentation task, where it achieves performance comparable to state-of-the-art supervised methods on the Internet object dataset.
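
The recursive refinement step described in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function names (`assign_category`, `recursive_refine`, `train_fn`, `predict_fn`) are hypothetical, the coarse foreground masks are taken as given (the unsupervised foreground model and the graph-based enhancement are not reproduced), and a toy nearest-centroid classifier on pixel intensity stands in for the fully convolutional network. It is not the authors' code.

```python
import numpy as np


def assign_category(fg_mask, label):
    """Turn a binary foreground mask into a label map: 0 = background, `label` = foreground."""
    return np.where(fg_mask, label, 0).astype(np.int64)


def recursive_refine(images, coarse_masks, labels, train_fn, predict_fn, rounds=3):
    """Hypothetical coarse-to-fine loop: train on the current masks, then re-predict them."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    # Round 0 targets: coarse masks tagged with the single image-level label.
    targets = [assign_category(m, c) for m, c in zip(coarse_masks, labels)]
    for _ in range(rounds):
        model = train_fn(images, targets)                      # fit on current masks
        targets = [predict_fn(model, im) for im in images]     # refined masks
    return model, targets


# Toy stand-in for the segmentation network (assumption, NOT an FCN):
# one mean intensity per class, pixels assigned to the nearest mean.
def train_fn(images, targets):
    pixels = np.concatenate([im.ravel() for im in images])
    classes = np.concatenate([t.ravel() for t in targets])
    return {int(c): float(pixels[classes == c].mean()) for c in np.unique(classes)}


def predict_fn(model, image):
    classes = np.array(sorted(model))
    centroids = np.array([model[c] for c in classes])
    nearest = np.abs(image[..., None] - centroids).argmin(axis=-1)
    return classes[nearest]


# Usage: a bright 4x4 object with a coarse mask shifted by one pixel.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0
coarse = np.zeros((8, 8), dtype=bool)
coarse[1:5, 1:5] = True
model, refined = recursive_refine([image], [coarse], [3], train_fn, predict_fn, rounds=2)
```

In this toy setting the shifted coarse mask is pulled onto the true object after the first round, because the model trained on noisy labels still separates bright from dark pixels. A real system would replace `train_fn` and `predict_fn` with network training and inference.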


Updated: 2020-04-22
