Deep Unsupervised Hashing for Large-Scale Cross-Modal Retrieval Using Knowledge Distillation Model, Computational Intelligence and Neuroscience

Computational Intelligence and Neuroscience (IF 3.120), Pub Date: 2021-07-17, DOI: 10.1155/2021/5107034
Mingyong Li, Qiqi Li, Lirong Tang, Shuang Peng, Yan Ma, Degang Yang

Cross-modal hashing encodes heterogeneous multimedia data into compact binary codes to enable fast and flexible retrieval across modalities. It has received widespread attention because of its low storage cost and high retrieval efficiency. Supervised deep hashing significantly improves search performance and usually yields more accurate results, but it requires extensive manual annotation of the data. Unsupervised deep hashing, in contrast, struggles to achieve satisfactory performance because it lacks reliable supervisory information. To address this problem, and inspired by knowledge distillation, we propose a novel unsupervised knowledge distillation cross-modal hashing method based on semantic alignment (SAKDH). It reconstructs the similarity matrix from the hidden correlation information of a pretrained unsupervised teacher model, and the reconstructed similarity matrix is then used to guide the supervised student model. Specifically, the teacher model first adopts an unsupervised semantic alignment hashing method to construct a modal fusion similarity matrix. Then, under the supervision of the distillation information from the teacher model, the student model generates more discriminative hash codes. Experimental results on two widely used benchmark datasets (MIRFLICKR-25K and NUS-WIDE) show that the proposed method achieves a significant improvement in mean average precision (MAP) over several representative unsupervised cross-modal hashing methods, which demonstrates its effectiveness for large-scale cross-modal data retrieval.
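
The abstract names three ingredients without giving formulas: a fused similarity matrix built from teacher hash codes, a student trained to match that matrix, and evaluation by MAP over Hamming ranking. The following is a minimal illustrative sketch of those ideas in plain Python, not the authors' implementation. The equal-weight fusion (`alpha`), the squared-error distillation loss, and every function name here are assumptions made for illustration; real student networks would also use relaxed real-valued outputs rather than the ±1 codes used below.

```python
from typing import List, Sequence

Code = Sequence[int]  # one hash code with entries in {-1, +1}


def binarize(features: Sequence[float]) -> List[int]:
    """Turn real-valued features into a {-1, +1} hash code by taking the sign."""
    return [1 if v >= 0 else -1 for v in features]


def code_similarity(a: Code, b: Code) -> float:
    """Normalized inner product in [-1, 1]; 1.0 means identical codes."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def fused_similarity_matrix(img_codes, txt_codes, alpha=0.5):
    """Hypothetical modal-fusion matrix: weighted mix of the image-image and
    text-text similarities of the teacher codes (alpha is an assumption)."""
    n = len(img_codes)
    return [
        [
            alpha * code_similarity(img_codes[i], img_codes[j])
            + (1 - alpha) * code_similarity(txt_codes[i], txt_codes[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def distillation_loss(student_img, student_txt, s_teacher):
    """Mean squared gap between the student's image-text similarities and the
    teacher matrix (an assumed loss form; the abstract does not specify one)."""
    n = len(s_teacher)
    total = 0.0
    for i in range(n):
        for j in range(n):
            gap = code_similarity(student_img[i], student_txt[j]) - s_teacher[i][j]
            total += gap * gap
    return total / (n * n)


def hamming_distance(a: Code, b: Code) -> int:
    """Number of positions where two codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def average_precision(query: Code, database, relevant) -> float:
    """AP for one query; relevant[j] is truthy if database item j is a match."""
    order = sorted(range(len(database)),
                   key=lambda j: hamming_distance(query, database[j]))
    hits, precision_sum = 0, 0.0
    for rank, j in enumerate(order, start=1):
        if relevant[j]:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries, database, relevance) -> float:
    """MAP: the mean of per-query average precision."""
    aps = [average_precision(q, database, r) for q, r in zip(queries, relevance)]
    return sum(aps) / len(aps)
```

In a cross-modal setting, `queries` would hold codes from one modality (for example, images) and `database` codes from the other (text), with `relevance` derived from labels that are used only for evaluation. Plain lists keep the sketch dependency-free; an array library would be the natural choice at dataset scale.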

Updated: 2021-07-18
