A Novel Strategy to Balance the Results of Cross-Modal Hashing
Pattern Recognition (IF 8), Pub Date: 2020-11-01, DOI: 10.1016/j.patcog.2020.107523
Fangming Zhong, Zhikui Chen, Geyong Min, Feng Xia

Abstract Hashing methods for cross-modal retrieval have drawn increasing research interest and have been widely studied in recent years due to the explosive growth of multimedia big data. However, a significant phenomenon that has been ignored is that, in most cases, there is a large gap between the results of the two retrieval directions of cross-modal hashing. For example, the results of Text-to-Image retrieval frequently outperform those of Image-to-Text retrieval by a large margin. In this paper, we propose a strategy named semantic augmentation to improve and balance the results of cross-modal hashing. An intermediate semantic space is constructed to re-align the feature representations that are embedded with weak semantic information. By using the intermediate semantic space, the semantic information of visual features can be further augmented before they are sent to cross-modal hashing algorithms. Extensive experiments are carried out on four datasets with seven state-of-the-art cross-modal hashing methods. Compared with the results without semantic augmentation, the Image-to-Text results of these methods with semantic augmentation are improved considerably, which demonstrates the effectiveness of the proposed strategy in bridging the gap between the results of cross-modal retrieval. Additional experiments are conducted on real-valued, semi-supervised, semi-paired, partial-paired, and unpaired cross-modal retrieval methods; the results further indicate the effectiveness of our strategy in improving the performance of cross-modal retrieval.
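The gap between the two retrieval directions can be made concrete with a small sketch. The abstract does not name its evaluation metric, so the code below assumes mean average precision (mAP) over a Hamming-distance ranking, a common choice for hashing-based retrieval, and treats items sharing a class label as relevant. The function names, binary codes, and labels are all hypothetical toy values chosen for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def average_precision(query_code: Sequence[int], query_label: int,
                      db_codes: List[Sequence[int]], db_labels: List[int]) -> float:
    """AP of one query; database items are ranked by Hamming distance."""
    # sorted() is stable, so ties keep their original database order.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if db_labels[i] == query_label:  # assumption: same label means relevant
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(q_codes, q_labels, db_codes, db_labels) -> float:
    """Mean of the per-query average precisions."""
    aps = [average_precision(c, l, db_codes, db_labels)
           for c, l in zip(q_codes, q_labels)]
    return sum(aps) / len(aps)


# Hypothetical 4-bit codes for four image/text pairs in two classes.
labels = [0, 0, 1, 1]
text_codes = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1], [1, 1, 1, 0]]
# The image codes are deliberately noisier, mimicking weak visual semantics.
image_codes = [[0, 0, 0, 0], [1, 1, 0, 1], [1, 1, 1, 1], [0, 0, 1, 0]]

# Image-to-Text: image queries against a text database, and vice versa.
i2t_map = mean_average_precision(image_codes, labels, text_codes, labels)
t2i_map = mean_average_precision(text_codes, labels, image_codes, labels)

print(f"Image-to-Text mAP: {i2t_map:.3f}")
print(f"Text-to-Image mAP: {t2i_map:.3f}")
print(f"Gap (T2I - I2T):   {t2i_map - i2t_map:.3f}")
```

In this toy setup the noisy image codes hurt the Image-to-Text direction more than the Text-to-Image direction, which is the kind of imbalance the abstract describes. The sketch only measures the gap; it does not implement the semantic augmentation strategy itself.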

Updated: 2020-11-01