Zero-shot Cross-media Embedding Learning with Dual Adversarial Distribution Network, IEEE Transactions on Circuits and Systems for Video Technology


IEEE Transactions on Circuits and Systems for Video Technology (IF 8.3), Pub Date: 2020-04-01, DOI: 10.1109/tcsvt.2019.2900171
Jingze Chi, Yuxin Peng

Existing cross-media retrieval methods mostly assume that the training set covers all the categories in the testing set, so they lack the extensibility to retrieve data of new categories. Zero-shot cross-media retrieval is therefore a promising direction for practical applications: it aims to retrieve data of new categories (unseen categories) using only data of a limited set of known categories (seen categories) for training. The task is challenging because both the heterogeneous distributions across different media types and the inconsistent semantics across seen and unseen categories need to be handled. To address these issues, we propose the dual adversarial distribution network (DADN), which learns common embeddings and exploits the knowledge in the word embeddings of different categories. The main contributions are as follows. First, a zero-shot cross-media dual generative adversarial network architecture is proposed, in which two kinds of generative adversarial networks (GANs), one for common embedding generation and one for representation reconstruction, form dual processes. The dual GANs mutually promote each other in modeling semantic and underlying structure information, which generalizes across different categories on heterogeneous distributions and boosts correlation learning. Second, distribution matching with the maximum mean discrepancy criterion is combined with the dual GANs, which enhances distribution matching between common embeddings and category word embeddings. Finally, an adversarial inter-media metric constraint is proposed with an inter-media loss and a quadruplet loss, which further models the inter-media correlation information and improves semantic ranking ability. Experiments on four widely used cross-media datasets demonstrate the effectiveness of our DADN approach.
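
The maximum mean discrepancy (MMD) criterion named in the abstract can be illustrated with a small sketch. The abstract does not specify the kernel, the estimator, or how the term is weighted inside DADN, so everything below is an assumption made for illustration: a Gaussian kernel, the standard biased estimate of squared MMD, a fixed `bandwidth`, and toy two-dimensional vectors standing in for common embeddings and category word embeddings. It is not the authors' implementation.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two vectors; the kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of vectors.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    with each expectation replaced by a mean over all sample pairs.
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical toy data: stand-ins for learned common embeddings and for
# category word embeddings that lie close to or far from them.
common_embeddings = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1]]
word_embeddings_near = [[0.1, 0.25], [0.05, 0.2], [0.15, 0.15]]
word_embeddings_far = [[2.0, 2.1], [1.9, 2.2], [2.1, 2.0]]

mmd_near = mmd_squared(common_embeddings, word_embeddings_near)
mmd_far = mmd_squared(common_embeddings, word_embeddings_far)
print(f"MMD^2 (near): {mmd_near:.6f}")
print(f"MMD^2 (far):  {mmd_far:.6f}")
```

Under these assumptions, a smaller value means the two samples are harder to tell apart, which is why a term of this kind can serve as a distribution-matching penalty during training.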


Updated: 2020-04-01
