Inductive Structure Consistent Hashing via Flexible Semantic Calibration
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2), Pub Date: 2020-09-09, DOI: 10.1109/tnnls.2020.3018790
Zheng Zhang, Luyao Liu, Yadan Luo, Zi Huang, Fumin Shen, Heng Tao Shen, Guangming Lu

Semantic-preserving hashing enables efficient multimedia retrieval by transferring knowledge from the original data to hash codes, so that the codes preserve the underlying visual and semantic similarities. A crucial bottleneck remains, however: how to effectively bridge the trilateral domain gaps (i.e., among the visual, semantic, and hashing spaces) to further improve retrieval accuracy. In this article, we propose an inductive structure consistent hashing (ISCH) method, which interactively coordinates the semantic correlations among the visual feature space, the binary class space, and the discrete hashing space. Specifically, an inductive semantic space is formulated by a simple multilayer stacking class-encoder, which transforms the naive class information into flexible semantic embeddings. Meanwhile, we design a semantic dictionary learning model to facilitate bilateral visual-semantic bridging and guide the class-encoder toward reliable semantics, which can alleviate the visual-semantic bias problem. In particular, the visual descriptors and their respective semantic class representations are regularized with a coinciding alignment module. To generate privileged hash codes, we further explore semantic and prototype binary code learning to jointly quantize the semantic and latent visual representations into unified discrete hash codes. Moreover, an efficient optimization algorithm is developed to address the resulting discrete programming problem. Comprehensive experiments conducted on four large-scale data sets, i.e., CIFAR-10, NUSWIDE, ImageNet, and MSCOCO, demonstrate the superiority of our method over state-of-the-art alternatives under different evaluation protocols.
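
For readers unfamiliar with hashing-based retrieval, the following is a minimal generic sketch of how binary hash codes are typically used at query time: features are quantized to codes in {-1, +1} and database items are ranked by Hamming distance. It is not the ISCH method described above. The random projection stands in for a learned hash function, and all names (`binarize`, `hamming_distances`, `retrieve`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def binarize(features, projection):
    """Project features and take the sign to obtain codes in {-1, +1}."""
    codes = np.sign(features @ projection)
    codes[codes == 0] = 1  # map exact zeros to +1 so every bit is +1 or -1
    return codes.astype(np.int8)

def hamming_distances(query_code, database_codes):
    """Hamming distance between one code and every database code."""
    n_bits = database_codes.shape[1]
    # For codes in {-1, +1}: distance = (n_bits - inner product) / 2
    inner = database_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (n_bits - inner) // 2

def retrieve(query_code, database_codes, top_k=5):
    """Indices of the top_k database items closest in Hamming distance."""
    distances = hamming_distances(query_code, database_codes)
    return np.argsort(distances, kind="stable")[:top_k]

# Assumption: a random projection replaces the learned hash function.
rng = np.random.default_rng(0)
database_features = rng.normal(size=(100, 32))
projection = rng.normal(size=(32, 16))
database_codes = binarize(database_features, projection)

query_code = database_codes[7]  # query with an item already in the database
ranking = retrieve(query_code, database_codes, top_k=5)
```

Because the codes are binary, ranking reduces to integer inner products, which is where the efficiency of hashing-based retrieval comes from. Learning codes that preserve semantic similarity is the part the paper addresses and is not shown here.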

Updated: 2020-09-09