Context-sensitive zero-shot semantic segmentation model based on meta-learning
Neurocomputing (IF 6), Pub Date: 2021-09-02, DOI: 10.1016/j.neucom.2021.08.120
Wenjian Wang 1, 2, 3 , Lijuan Duan 1, 2, 3 , Qing En 1, 2, 3 , Baochang Zhang 4

Zero-shot semantic segmentation requires models with a strong image-understanding ability. Most current solutions are based on direct mapping or generation. These schemes are effective for zero-shot recognition, but they cannot fully transfer the visual dependencies between objects in the more complex scenes of semantic segmentation. More importantly, the predictions become severely biased toward the seen categories in the training set, which makes it difficult to recognize unseen categories accurately. To address these two problems, we propose a novel zero-shot semantic segmentation model based on meta-learning. We observe that a purely semantic-space representation has limitations for zero-shot learning. Therefore, on top of the original semantic transfer, we first transfer the shared information in the visual space by adding a context module, and then transfer it in the joint visual-semantic dual space. Meanwhile, to reduce the bias, we improve the adaptability of the model parameters by adjusting the dual-space parameters through meta-learning, so that the model can complete segmentation even when facing new categories with no reference samples. Experiments show that our algorithm outperforms the best existing methods for zero-shot segmentation on three datasets: Pascal-VOC 2012, Pascal-Context and Coco-stuff.
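The meta-learning idea described above — adapt parameters on seen-category data, then update them by how well the adapted parameters generalize to held-out data — can be sketched as a first-order MAML-style loop. The sketch below is only an illustration of that optimization pattern on a toy linear mapping, not the paper's actual model; all function names, the support/query split, and the learning rates are hypothetical.

```python
import numpy as np

def loss_and_grad(w, X, y):
    # Squared-error loss and gradient for a toy linear map y ≈ X @ w
    # (stand-in for the dual-space mapping parameters in the paper).
    err = X @ w - y
    loss = float(np.mean(err ** 2))
    grad = 2.0 * X.T @ err / len(y)
    return loss, grad

def maml_step(w, support, query, inner_lr=0.1, outer_lr=0.01):
    Xs, ys = support   # "seen-category" support set
    Xq, yq = query     # held-out query set standing in for unseen categories
    # Inner loop: adapt parameters on the support set.
    _, g_inner = loss_and_grad(w, Xs, ys)
    w_adapted = w - inner_lr * g_inner
    # Outer loop: score the adapted parameters on the query set and update
    # the initial parameters (first-order approximation: second derivatives
    # through the inner step are ignored).
    loss_q, g_outer = loss_and_grad(w_adapted, Xq, yq)
    return w - outer_lr * g_outer, loss_q

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
Xs, Xq = rng.normal(size=(8, 3)), rng.normal(size=(8, 3))
ys, yq = Xs @ true_w, Xq @ true_w

w = rng.normal(size=3)
for _ in range(200):
    w, loss_q = maml_step(w, (Xs, ys), (Xq, yq))
print(f"final query loss: {loss_q:.4f}")
```

Because the outer update is driven by query-set performance after adaptation, the initial parameters are pushed toward a region that transfers well beyond the support data — the same mechanism the paper uses to keep the dual-space parameters from overfitting to seen categories.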




Updated: 2021-09-24