Generalized Multi-View Embedding for Visual Recognition and Cross-Modal Retrieval
IEEE Transactions on Cybernetics ( IF 9.4 ) Pub Date : 2017-09-07 , DOI: 10.1109/tcyb.2017.2742705
Guanqun Cao , Alexandros Iosifidis , Ke Chen , Moncef Gabbouj

In this paper, the problem of multi-view embedding from different visual cues and modalities is considered. We propose a unified solution for subspace learning methods using the Rayleigh quotient, which is extensible to multiple views, supervised learning, and nonlinear embeddings. Numerous methods, including canonical correlation analysis, partial least squares regression, and linear discriminant analysis, are studied using specific intrinsic and penalty graphs within the same framework. Nonlinear extensions based on kernels and (deep) neural networks are derived, achieving better performance than the linear ones. Moreover, a novel multi-view modular discriminant analysis is proposed by taking the view difference into consideration. We demonstrate the effectiveness of the proposed multi-view embedding methods on visual object recognition and cross-modal image retrieval, and obtain superior results in both applications compared to related methods.
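To make the Rayleigh-quotient view concrete, the sketch below shows how one member of the framework, two-view canonical correlation analysis, reduces to maximizing a quotient of cross- and within-view covariances, solved here by whitening and an SVD (equivalent to the generalized eigenvalue problem). This is an illustrative minimal example with made-up data, not the paper's implementation; all variable names are assumptions.

```python
import numpy as np

# Illustrative sketch: two-view CCA as a Rayleigh-quotient problem.
# The projection directions maximize w_x^T Sxy w_y subject to unit
# within-view variance, i.e. a generalized eigenvalue problem.

rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal((n, 1))                    # shared latent signal
X = np.hstack([z, rng.standard_normal((n, 2))])    # view 1 (3 dims)
Y = np.hstack([z, rng.standard_normal((n, 3))])    # view 2 (4 dims)

Xc, Yc = X - X.mean(0), Y - Y.mean(0)
Sxx = Xc.T @ Xc / n + 1e-6 * np.eye(X.shape[1])    # regularized covariances
Syy = Yc.T @ Yc / n + 1e-6 * np.eye(Y.shape[1])
Sxy = Xc.T @ Yc / n

# Whiten each view with its Cholesky factor; the singular values of the
# whitened cross-covariance are the canonical correlations.
Lx = np.linalg.cholesky(Sxx)
Ly = np.linalg.cholesky(Syy)
M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
U, s, Vt = np.linalg.svd(M)
print(round(float(s[0]), 2))  # top canonical correlation; near 1 here,
                              # since both views share the latent z
```

Supervised members of the framework (e.g. LDA) follow the same pattern with different intrinsic/penalty graph matrices in the numerator and denominator of the quotient.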

Updated: 2017-09-07