Deep Neighborhood Component Analysis for Visual Similarity Modeling
ACM Transactions on Intelligent Systems and Technology ( IF 7.2 ) Pub Date : 2020-05-04 , DOI: 10.1145/3375787
Xueliang Liu 1 , Xun Yang 2 , Meng Wang 1 , Richang Hong 1

Learning effective visual similarity is an essential problem in multimedia research. Despite the promising progress made in recent years, most existing approaches learn visual features and similarities in two separate stages, which inevitably limits their performance. Once useful information has been lost in the feature extraction stage, it can hardly be recovered later. This article proposes a novel end-to-end approach for visual similarity modeling, called deep neighborhood component analysis, which discriminatively trains deep neural networks to jointly learn visual features and similarities. Specifically, we first formulate a metric learning objective that maximizes the intra-class correlations and minimizes the inter-class correlations under the neighborhood component analysis criterion, and then train deep convolutional neural networks to learn a nonlinear mapping that projects visual instances from the original feature space to a discriminative and neighborhood-structure-preserving embedding space, thus resulting in better performance. We conducted extensive evaluations on several widely used and challenging datasets, and the impressive results demonstrate the effectiveness of our proposed approach.
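
As a rough illustration of the neighborhood component analysis criterion mentioned in the abstract, the NumPy sketch below computes the classic NCA objective on a batch of embeddings: each point picks a neighbor with probability given by a softmax over negative squared distances, and the loss rewards probability mass that lands on same-class neighbors. The abstract does not state the authors' exact loss, so the function name `nca_loss`, the squared Euclidean distance, and the mean reduction are all assumptions. In the proposed approach the embeddings would come from a convolutional network trained end to end, which is omitted here.

```python
import numpy as np

def nca_loss(embeddings, labels):
    """Classic NCA objective (assumed form): negative mean probability
    that a point selects a same-class neighbor."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all embeddings.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # A point never selects itself as its neighbor.
    np.fill_diagonal(sq_dist, np.inf)
    # Softmax over negative distances, shifted per row for numerical stability.
    logits = -sq_dist
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    p = weights / weights.sum(axis=1, keepdims=True)
    # Sum neighbor probabilities over same-class pairs.
    same = labels[:, None] == labels[None, :]
    p_correct = np.sum(p * same, axis=1)
    return -np.mean(p_correct)

# Hypothetical toy data: two tight, well-separated clusters.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
loss_aligned = nca_loss(X, [0, 0, 1, 1])   # labels agree with the clusters
loss_shuffled = nca_loss(X, [0, 1, 0, 1])  # labels cut across the clusters
```

When labels agree with the cluster structure the loss approaches its minimum of -1, and when they cut across it the loss approaches 0. That gap is the signal gradient-based training would use to reshape the embedding space.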

Updated: 2020-05-04