Multi-modal generative adversarial network for zero-shot learning
Knowledge-Based Systems (IF 7.2), Pub Date: 2020-04-07, DOI: 10.1016/j.knosys.2020.105847
Zhong Ji, Kexin Chen, Junyue Wang, Yunlong Yu, Zhongfei Zhang

In this paper, we propose a novel approach for Zero-Shot Learning (ZSL), where the test instances come from novel categories for which no visual data are available during training. Existing approaches typically address ZSL by embedding the visual features into a category-shared semantic space. However, these embedding-based approaches easily suffer from the "heterogeneity gap" issue, since a single type of class semantic prototype cannot characterize the categories well. To alleviate this issue, we assume that different class semantics reflect different views of the corresponding class, and thus fuse various types of class semantic prototypes residing in different semantic spaces with a feature fusion network to generate pseudo visual features. Through the adversarial mechanism between the real visual features and the fused pseudo visual features, the complementary semantics in the various spaces are effectively captured. Experimental results on three benchmark datasets demonstrate that the proposed approach achieves impressive performance on both traditional ZSL and generalized ZSL tasks.
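The abstract does not include code, but the architecture it describes — an early-fusion generator that maps several class semantic prototypes (e.g. attribute vectors and word embeddings) plus noise to pseudo visual features, trained adversarially against a discriminator on real visual features — can be sketched as below. This is a minimal NumPy illustration under assumed dimensions (85-d attributes, 300-d word vectors, 2048-d visual features); the class names, layer sizes, and concatenation-based fusion are illustrative assumptions, not the authors' exact network.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class FusionGenerator:
    """Fuses class semantic prototypes from multiple semantic spaces
    (plus noise) into a pseudo visual feature. Fusion here is simple
    concatenation followed by a 2-layer MLP -- an assumption, not the
    paper's exact fusion network."""
    def __init__(self, sem_dims, noise_dim, vis_dim, hidden=64):
        in_dim = sum(sem_dims) + noise_dim
        self.W1 = rng.normal(0.0, 0.02, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.02, (hidden, vis_dim))
        self.b2 = np.zeros(vis_dim)

    def forward(self, semantics, noise):
        # semantics: list of (batch, dim_i) arrays, one per semantic space
        z = np.concatenate(semantics + [noise], axis=1)  # early fusion
        return relu(z @ self.W1 + self.b1) @ self.W2 + self.b2

class Discriminator:
    """Scores whether a visual feature is real or generated (logits)."""
    def __init__(self, vis_dim, hidden=64):
        self.W1 = rng.normal(0.0, 0.02, (vis_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.02, (hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, x):
        return relu(x @ self.W1 + self.b1) @ self.W2 + self.b2

# Toy batch of 4 classes with two semantic views (hypothetical sizes).
attrs = rng.normal(size=(4, 85))      # attribute prototypes
wordvecs = rng.normal(size=(4, 300))  # word-embedding prototypes
noise = rng.normal(size=(4, 32))

gen = FusionGenerator(sem_dims=[85, 300], noise_dim=32, vis_dim=2048)
disc = Discriminator(vis_dim=2048)

pseudo = gen.forward([attrs, wordvecs], noise)  # (4, 2048) pseudo visual features
scores = disc.forward(pseudo)                   # (4, 1) real/fake logits
```

In training, the discriminator would be pushed to score real CNN features high and pseudo features low, while the generator is updated to fool it; classifiers for unseen classes can then be trained on pseudo features synthesized from those classes' semantic prototypes.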




Updated: 2020-04-08