Deep Collaborative Discrete Hashing with Semantic-Invariant Structure Construction, IEEE Transactions on Multimedia



IEEE Transactions on Multimedia (IF 7.3), Pub Date: 2020-01-01, DOI: 10.1109/tmm.2020.2995267
Zijian Wang, Zheng Zhang, Yandan Luo, Zi Huang, Heng Tao Shen

While deep hashing has made great progress in large-scale multimedia retrieval, most existing approaches under-explore semantic correlations and neglect the effect of context-aware visual learning. In this paper, we propose a dual-stream learning framework, termed Deep Collaborative Discrete Hashing (DCDH), which constructs a discriminative common discrete space by collaboratively incorporating the shared and individual semantics deduced from visual features and semantics. Specifically, DCDH generates context-aware representations by employing the outer product of visual embeddings and semantic encodings. To further preserve the original semantics and alleviate the class imbalance problem, we introduce the focal loss to take advantage of frequent and rare concepts. Furthermore, a common binary code space is constructed based on the joint learning of the visual representations, the context-aware representations, and the label distribution calibration. Three losses, i.e., the pairwise similarity loss, the quantization loss, and the balanced classification loss, are collaboratively optimized in the general learning framework of DCDH. Extensive experiments conducted on three large-scale benchmark datasets demonstrate the superiority of the proposed method, yielding state-of-the-art image retrieval performance.
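The abstract names its building blocks (an outer product of visual embeddings and semantic encodings, a focal loss, a pairwise similarity loss, and a quantization loss) without giving formulas. The sketch below illustrates what such terms commonly look like; it is not the authors' implementation. The function names, the 0.5 scaling in the pairwise term, the sign-based quantization target, and the default `gamma` are all assumptions borrowed from common deep-hashing and focal-loss formulations, and the abstract does not say how the three losses are weighted.

```python
import math

def outer_product(visual, semantic):
    """Flattened outer product: every visual dimension is paired with every
    semantic dimension (assumed form of a 'context-aware' representation)."""
    return [v * s for v in visual for s in semantic]

def focal_loss(probs, labels, gamma=2.0, eps=1e-12):
    """Multi-label focal loss averaged over classes.
    Uses FL = -(1 - p_t)^gamma * log(p_t); gamma = 0 recovers cross-entropy.
    gamma = 2.0 is an assumed default, not a value taken from the paper."""
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = p if y == 1 else 1.0 - p
        total += -((1.0 - p_t) ** gamma) * math.log(max(p_t, eps))
    return total / len(probs)

def pairwise_similarity_loss(h_i, h_j, similar):
    """Negative log-likelihood of a 0/1 similarity label given two relaxed codes.
    Assumed form: theta = 0.5 * <h_i, h_j>, loss = log(1 + e^theta) - s * theta."""
    theta = 0.5 * sum(a * b for a, b in zip(h_i, h_j))
    # Numerically stable softplus(theta) = log(1 + e^theta).
    softplus = max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))
    return softplus - similar * theta

def quantization_loss(h):
    """Squared distance between a relaxed code and its sign binarization
    (assumed target: +1 for non-negative entries, -1 otherwise)."""
    return sum((x - (1.0 if x >= 0 else -1.0)) ** 2 for x in h)

if __name__ == "__main__":
    # Toy inputs, made up for illustration only.
    visual = [0.9, -0.8, 0.7, -0.6]   # relaxed 4-bit visual embedding
    semantic = [1.0, 0.0, 1.0]        # multi-hot label encoding
    context = outer_product(visual, semantic)
    print(len(context), "context-aware features")
    print("focal:", focal_loss([0.9, 0.2, 0.6], [1, 0, 1]))
    print("pairwise:", pairwise_similarity_loss(visual, visual, 1))
    print("quantization:", quantization_loss(visual))
```

In this form the focal term down-weights classes the model already predicts confidently, which is the usual reason for using it under class imbalance, and the quantization term is zero only when every code entry is exactly +1 or -1.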


Updated: 2020-01-01
