Deep Multi-View Enhancement Hashing for Image Retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2-25-2020, DOI: 10.1109/tpami.2020.2975798
Chenggang Yan, Biao Gong, Yuxuan Wei, Yue Gao

Hashing is an efficient approach to nearest neighbor search in large-scale data: it embeds high-dimensional feature descriptors into a low-dimensional, similarity-preserving Hamming space. However, large-scale, high-speed retrieval with binary codes is somewhat less accurate than traditional retrieval methods. We observe that multi-view methods preserve the diverse characteristics of data well. We therefore introduce a multi-view deep neural network into hash learning and design an efficient and innovative retrieval model that significantly improves retrieval performance. In this paper, we propose a supervised multi-view hash model that enhances multi-view information through neural networks. This is a completely new hash learning method that combines multi-view and deep learning methods. The proposed method uses an effective view stability evaluation method to actively explore the relationships among views, which affects the optimization direction of the entire network. We also design several multi-data fusion methods in the Hamming space to preserve the advantages of both convolution and multi-view. To avoid spending excessive computing resources on the enhancement procedure during retrieval, we set up a separate structure, called a memory network, that is trained jointly with the rest of the model. The proposed method is systematically evaluated on the CIFAR-10, NUS-WIDE and MS-COCO datasets, and the results show that it significantly outperforms state-of-the-art single-view and multi-view hashing methods.
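
To make the idea of retrieval in a Hamming space concrete, the following is a minimal sketch, not the method proposed in the paper. It swaps in random hyperplane projections (a simple locality-sensitive hashing scheme) for the learned multi-view network, and only illustrates how real-valued descriptors become binary codes that are ranked by Hamming distance. The function names `make_hyperplanes`, `hash_code`, `hamming` and `search` are hypothetical and chosen for illustration.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes in a dim-dimensional space.

    Assumption: random projections stand in for a learned hash function.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, planes):
    """Map a real-valued vector to a binary code packed into an int.

    Each bit is 1 if the vector lies on the non-negative side of a plane.
    """
    code = 0
    for plane in planes:
        dot = sum(w * x for w, x in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Hamming distance between two codes: number of differing bits."""
    return bin(a ^ b).count("1")


def search(query_code, db_codes, k=3):
    """Return indices of the k database codes closest to the query."""
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    return order[:k]


if __name__ == "__main__":
    rng = random.Random(1)
    dim, n_bits = 16, 32
    planes = make_hyperplanes(dim, n_bits)
    database = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(100)]
    db_codes = [hash_code(v, planes) for v in database]
    # Query with a slightly perturbed copy of item 7.
    query = [x + 0.01 * rng.gauss(0.0, 1.0) for x in database[7]]
    print(search(hash_code(query, planes), db_codes, k=3))
```

Comparing codes needs only an XOR and a bit count per database item, which is why binary codes suit large-scale retrieval. The accuracy loss mentioned in the abstract comes from the information discarded when descriptors are reduced to a few bits.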

Last updated: 2024-08-22