High-order nonlocal Hashing for unsupervised cross-modal retrieval

Zhang, Peng-Fei; Luo, Yadan; Huang, Zi; Xu, Xin-Shun; Song, Jingkuan

doi:10.1007/s11280-020-00859-y

Published: 27 February 2021

Volume 24, pages 563–583, (2021)
World Wide Web

Peng-Fei Zhang¹,
Yadan Luo¹,
Zi Huang ORCID: orcid.org/0000-0002-9738-4949¹,
Xin-Shun Xu² &
Jingkuan Song³

Abstract

Hashing techniques for cross-modal search have attracted extensive attention because they enable efficient storage and fast querying over large-scale data. Despite this success, unsupervised cross-modal hashing still lacks reliable similarity supervision and struggles to handle the heterogeneity between modalities. To address these issues, we devise a new deep hashing model, termed High-order Nonlocal Hashing (HNH), which facilitates cross-modal retrieval with the following advantages. First, unlike existing methods that mainly use low-level, local-view similarity to guide hash learning, we propose a high-order affinity measure that considers the multi-modal neighbourhood structures from a nonlocal perspective, thereby capturing the similarity relationships between data items more comprehensively. Second, a common representation is introduced to correlate the different modalities. By enforcing alignment between the modality-specific descriptors and the common representation, HNH narrows the modality gap and maintains intra-consistency. Third, an affinity-preserving objective function is designed to generate high-quality binary codes. Extensive experiments demonstrate the superiority of HNH over state-of-the-art baselines in unsupervised cross-modal retrieval tasks.
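
To make the idea of a high-order, neighbourhood-based affinity more concrete, the following is a minimal sketch and not the formulation used in HNH, which the abstract does not give. It assumes cosine similarity as the low-level measure, a hypothetical mixing weight `alpha` between the image and text similarities, and a second-order step that compares two items by their whole similarity profiles instead of by their direct similarity. All function names are illustrative.

```python
import math


def cosine_matrix(rows):
    """Pairwise cosine similarity between the row vectors in `rows`."""
    # Guard against zero vectors by treating a zero norm as 1.0.
    norms = [math.sqrt(sum(x * x for x in r)) or 1.0 for r in rows]
    return [
        [sum(a * b for a, b in zip(ri, rj)) / (ni * nj)
         for rj, nj in zip(rows, norms)]
        for ri, ni in zip(rows, norms)
    ]


def high_order_affinity(img_feats, txt_feats, alpha=0.5):
    """Illustrative second-order affinity over paired image/text features.

    Assumption: `img_feats[i]` and `txt_feats[i]` describe the same item.
    `alpha` is a hypothetical weight, not a parameter taken from the paper.
    """
    # First-order (local-view) similarities within each modality.
    s_img = cosine_matrix(img_feats)
    s_txt = cosine_matrix(txt_feats)
    n = len(s_img)
    # Fuse the two modalities into one low-level similarity matrix.
    s_low = [
        [alpha * s_img[i][j] + (1.0 - alpha) * s_txt[i][j] for j in range(n)]
        for i in range(n)
    ]
    # Second-order step: two items are similar if their rows of `s_low`
    # (their relations to every other item) point in the same direction.
    return cosine_matrix(s_low)


if __name__ == "__main__":
    img = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    txt = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
    for row in high_order_affinity(img, txt):
        print([round(v, 3) for v in row])
```

In this toy input the first two items have nearly identical neighbourhood profiles, so their affinity is close to 1, while the third item stays far from both. An affinity matrix of this kind could then serve as the target that the learned binary codes are trained to preserve.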

Author information

Authors and Affiliations

School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Australia
Peng-Fei Zhang, Yadan Luo & Zi Huang
School of Software, Shandong University, Jinan, China
Xin-Shun Xu
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
Jingkuan Song

Corresponding author

Correspondence to Zi Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, PF., Luo, Y., Huang, Z. et al. High-order nonlocal Hashing for unsupervised cross-modal retrieval. World Wide Web 24, 563–583 (2021). https://doi.org/10.1007/s11280-020-00859-y

Received: 07 September 2020
Revised: 17 December 2020
Accepted: 21 December 2020
Published: 27 February 2021
Issue Date: March 2021
DOI: https://doi.org/10.1007/s11280-020-00859-y
