Rich Embedding Features for One-Shot Semantic Segmentation
IEEE Transactions on Neural Networks and Learning Systems (IF 10.4), Pub Date: 2021-06-23, DOI: 10.1109/tnnls.2021.3081693
Xiaolin Zhang, Yunchao Wei, Zhao Li, Chenggang Yan, Yi Yang

One-shot semantic segmentation poses the challenging task of segmenting object regions from unseen categories with only one annotated example as guidance. Thus, how to effectively construct robust feature representations from the guidance image is crucial to the success of one-shot semantic segmentation. To this end, we propose in this article a simple, yet effective approach named rich embedding features (REFs). Given a reference image accompanied by its annotated mask, our REF constructs rich embedding features of the support object from three perspectives: 1) global embedding to capture the general characteristics; 2) peak embedding to capture the most discriminative information; and 3) adaptive embedding to capture the internal long-range dependencies. By combining these informative features, we can easily harvest sufficient and rich guidance even from a single reference image. In addition to REF, we further propose a simple depth-priority context module to obtain useful contextual cues from the query image. This successfully raises the performance of one-shot semantic segmentation to a new level. We conduct experiments on the pattern analysis, statistical modelling and computational learning (PASCAL) visual object classes (VOC) 2012 and common objects in context (COCO) datasets to demonstrate the effectiveness of our approach.
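To make the first two embedding perspectives concrete, the following is a minimal sketch of how a global embedding (masked average pooling over the support feature map) and a peak embedding (the feature vector at the most strongly activated foreground location) could be computed. This is an illustration under assumed shapes and names, not the authors' implementation: `features` is taken to be a C x H x W support feature map and `mask` an H x W binary foreground mask, both as nested Python lists.

```python
def global_embedding(features, mask):
    """Masked average pooling: mean of the feature vectors at foreground
    positions, yielding one C-dimensional vector for the support object."""
    C = len(features)
    H, W = len(mask), len(mask[0])
    count = sum(mask[i][j] for i in range(H) for j in range(W))
    return [
        sum(features[c][i][j] * mask[i][j] for i in range(H) for j in range(W)) / count
        for c in range(C)
    ]


def peak_embedding(features, mask):
    """Feature vector at the foreground location with the largest squared
    activation norm, i.e. the most discriminative support position."""
    C = len(features)
    H, W = len(mask), len(mask[0])
    best, best_ij = -1.0, None
    for i in range(H):
        for j in range(W):
            if mask[i][j]:
                norm = sum(features[c][i][j] ** 2 for c in range(C))
                if norm > best:
                    best, best_ij = norm, (i, j)
    i, j = best_ij
    return [features[c][i][j] for c in range(C)]


# Tiny example: a 2-channel, 2x2 feature map with two foreground pixels.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[5.0, 6.0], [7.0, 8.0]]]
mask = [[1, 0], [1, 0]]
print(global_embedding(features, mask))  # → [2.0, 6.0]
print(peak_embedding(features, mask))    # → [3.0, 7.0]
```

In practice the paper combines these with a third, adaptive embedding that models long-range dependencies within the support object; the pooling-style vectors above only convey the general idea of distilling one reference image into compact guidance features.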

Updated: 2021-06-23