Concatenation hashing: A relative position preserving method for learning binary codes
Introduction
Nearest neighbor search is widely used in machine learning and computer vision applications. However, with the development of feature representations, images and videos are now represented by high-dimensional feature vectors, and conventional nearest neighbor search methods [1] cannot handle such high-dimensional data. To address this problem, hashing methods have recently been used to perform approximate nearest neighbor (ANN) search efficiently [2], [3], [4], [5]. By mapping high-dimensional data to binary codes and using the Hamming distance to measure data similarity, hashing methods can perform ANN search on large-scale datasets with low storage cost and efficient computation.
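The efficiency argument above rests on the Hamming distance being cheap to compute on binary codes. As a minimal illustration (with toy hand-picked codes, not codes produced by any particular hashing method):

```python
import numpy as np

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Hamming distance: the number of bit positions where two codes differ."""
    return int(np.count_nonzero(a != b))

# Toy 8-bit codes for three database items and one query.
db = np.array([[0, 1, 1, 0, 1, 0, 0, 1],
               [0, 1, 1, 0, 1, 0, 1, 1],
               [1, 0, 0, 1, 0, 1, 1, 0]])
query = np.array([0, 1, 1, 0, 1, 0, 0, 1])

dists = [hamming_distance(query, code) for code in db]  # [0, 1, 8]
nearest = int(np.argmin(dists))                          # index 0
```

In production systems the codes are packed into machine words so that each distance is a handful of XOR and popcount instructions, which is what makes large-scale search feasible.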
Representative hashing methods include locality-sensitive hashing (LSH) [6] and its variants [7], which are data-independent. In LSH, hyperplanes are randomly generated so that similar data are mapped to similar binary codes with high probability. Since data-dependent hashing methods usually achieve better search accuracy than data-independent ones for ANN search, data-dependent methods [8], [9], [10], [11] have become increasingly popular. They can be categorized into unsupervised hashing methods [12], [13] and supervised hashing methods [14], [15], [16].
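The random-hyperplane variant of LSH can be sketched in a few lines: each bit is the sign of a projection onto a random direction, so nearby points agree on most bits while opposite points disagree (dimensions and seeds below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(X: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One bit per random hyperplane: the sign of the projection."""
    return (X @ W > 0).astype(np.uint8)

d, K = 16, 8                      # feature dimension, code length
W = rng.standard_normal((d, K))   # K random hyperplanes through the origin

x = rng.standard_normal(d)
x_near = x + 0.01 * rng.standard_normal(d)  # a slightly perturbed copy
x_far = -x                                  # a point in the opposite direction

c_x, c_near, c_far = lsh_hash(np.stack([x, x_near, x_far]), W)
d_near = int(np.count_nonzero(c_x != c_near))  # small: signs rarely flip
d_far = int(np.count_nonzero(c_x != c_far))    # 8: every projection flips sign
```

Because the hyperplanes are drawn without looking at the data, many bits are needed before the collision probabilities translate into good precision, which is the motivation for the data-dependent methods discussed next.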
By incorporating supervised information, supervised hashing methods learn hash functions that preserve the semantic similarity of the data [17], [18], [19], [20]. In contrast, unsupervised hashing methods learn hash functions without supervised information; they generate binary codes that preserve the distribution of the data in Euclidean space [21], [22], [23]. Since obtaining supervised information usually requires considerable manual labor, we focus on unsupervised hashing methods in this paper.
It has been proved that directly learning the optimal binary codes from the data is an NP-hard problem [24]. To avoid this problem, most hashing methods adopt a two-stage strategy consisting of a projection stage and a quantization stage [8], [24]. In the projection stage, the data are projected into a low-dimensional space by projection functions. In the quantization stage, the projected data are quantized into binary codes by the sign function or other quantization functions. Spectral hashing (SH) [24] constructs a graph to describe the relationships among the data and learns projection hyperplanes from the graph to project the data into a low-dimensional space. However, this method cannot scale to large datasets. To address the scalability issue, other graph-based hashing methods [13], [25] approximate the graph using a subset of the data, but they still face the out-of-sample problem. With the development of neural networks, some hashing methods [26], [27] adopt neural networks to project the data into a low-dimensional space and learn binary codes by quantizing the projected values. However, they are usually time-consuming and cannot be deployed on mobile devices.
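The two-stage strategy can be made concrete with a minimal sketch; the projection matrix below is a random placeholder standing in for whatever the projection stage learns (graph-based, PCA-based, or otherwise):

```python
import numpy as np

def project(X: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Projection stage: map the data into a K-dimensional space."""
    return X @ W

def quantize(Z: np.ndarray) -> np.ndarray:
    """Quantization stage: the sign function turns each dimension into a bit."""
    return (Z > 0).astype(np.uint8)

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 32))    # 5 data points with 32-dim features
W = rng.standard_normal((32, 16))   # placeholder projection matrix
codes = quantize(project(X, W))     # 5 binary codes of K = 16 bits each
```

The methods surveyed in this section differ almost entirely in how `W` (or its nonlinear analogue) is obtained; the quantization step is often the same elementwise sign function.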
Principal component analysis (PCA) [8] can learn a limited number of projection hyperplanes from the data to maximally preserve the data information, and it generalizes to unseen data. Hence, some hashing methods [8], [28] generate the projection hyperplanes by PCA in the projection stage and rotate the hyperplanes to minimize the quantization error between the PCA-projected data and the corresponding binary codes in the quantization stage. Although these PCA-based methods can preserve the global structure of the data, they ignore its local neighborhood structure. In Fig. 1(a), data points a, b, c and d lie on a line through the origin. No matter how the hyperplanes are rotated around the origin, a and b cannot be separated, and neither can c and d.
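The rotate-to-reduce-quantization-error idea can be sketched as an alternating scheme in the spirit of iterative quantization: fix the rotation and take the best binary codes, then fix the codes and solve an orthogonal Procrustes problem for the rotation. This is a sketch of the general scheme, not the exact published algorithm of [8] or [28]:

```python
import numpy as np

def pca_rotation_codes(X: np.ndarray, K: int, n_iter: int = 50, seed: int = 0):
    """PCA-project to K dimensions, then alternate between fixing the binary
    codes B and rotating the projections V to shrink ||B - V R||_F."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                           # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Xc @ Vt[:K].T                                 # PCA-projected data
    R = np.linalg.qr(rng.standard_normal((K, K)))[0]  # random orthogonal init
    for _ in range(n_iter):
        B = np.sign(V @ R)                   # fix R: best codes are the signs
        U, _, Wt = np.linalg.svd(B.T @ V)    # fix B: orthogonal Procrustes
        R = (U @ Wt).T
    return (V @ R > 0).astype(np.uint8), R

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 32))
codes, R = pca_rotation_codes(X, K=8)   # 100 codes of 8 bits, plus the rotation
```

Note that the rotation stays orthogonal throughout, which is exactly why the failure case in Fig. 1(a) cannot be fixed by this family of methods: rotating around the origin never separates collinear points.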
Clustering-based hashing methods learn hash functions by employing clustering techniques to model the complex relationships among the data. In spherical hashing (SPH) [29], each bit of the binary code is generated by a hypersphere-based hash function that groups spatially coherent data points. K-means hashing (KMH) [12] discovers clusters in the data and learns binary codes for the cluster indices. Since data points in the same cluster receive the same binary code, two similar data points may be assigned to two different clusters and thus encoded into two different binary codes. As shown in Fig. 1(b), although data points b and c are closer than c and d, c and d are in the same cluster while b and c belong to different clusters. The cluster boundaries therefore affect the performance of KMH.
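The boundary effect is easy to see in a stripped-down, KMH-flavored encoder: run k-means and give every point the binary expansion of its cluster index. This is a deliberate simplification (KMH additionally learns the codes so that Hamming distances approximate inter-center distances), but it exhibits the same failure mode at cluster boundaries:

```python
import numpy as np

def kmeans_index_codes(X: np.ndarray, k: int, n_iter: int = 20, seed: int = 0):
    """Toy encoder: k-means clustering, then each point's code is the binary
    expansion of its cluster index (all points in a cluster share one code)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):                 # recompute non-empty centers
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    bits = max(1, int(np.ceil(np.log2(k))))
    codes = ((assign[:, None] >> np.arange(bits)) & 1).astype(np.uint8)
    return codes, assign

rng = np.random.default_rng(4)
X = np.vstack([rng.standard_normal((20, 2)) + 10.0,   # blob A
               rng.standard_normal((20, 2)) - 10.0])  # blob B
codes, assign = kmeans_index_codes(X, k=2)
```

Within each blob all points share a code, so two points straddling a boundary between blobs would receive maximally different codes no matter how close they are in Euclidean space.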
Intuitively, if two data points are close to each other, their relative positions to each cluster center are also close. Following this intuition, we propose a new hashing method, concatenation hashing (CH), which learns the binary code of a data point by concatenating substrings learned from its relative positions to the cluster centers. The proposed method simultaneously performs clustering and learns hyperplane-based hash functions that preserve the relative position information of the data in each cluster. Hence, if two data points are close to each other, their substrings in each cluster should be similar. As shown in Fig. 1(c), b and c are closer than any other pair of data points, and they lie on the same side of each hyperplane, which is generated with the corresponding cluster center as the origin. The contributions of this paper are as follows:
- By employing the clustering technique and concatenating the substrings learned by the hash functions in each cluster, the proposed method can model the complex relationships among the data and alleviate the effect of cluster boundaries.
- An alternating optimization is developed to simultaneously discover the cluster structure of the data and learn the hash functions that preserve the relative positions of the data to each cluster center.
- The experiments show that the proposed method is competitive with or better than other unsupervised hashing methods. In particular, when learning long codes to achieve high search precision, the proposed method is clearly superior to the other methods.
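The encoding idea behind CH can be sketched as follows. For each cluster center, the bits of a substring are the signs of hyperplane projections of the point's offset from that center, and the substrings are concatenated across clusters. The centers and hyperplanes below are random placeholders for illustration only; in the paper they are learned jointly by the alternating optimization:

```python
import numpy as np

def ch_style_encode(X: np.ndarray, centers: np.ndarray, Ws: list) -> np.ndarray:
    """For each cluster center c_j, take the signs of hyperplane projections
    of (x - c_j) as a substring, then concatenate the substrings over all
    clusters into one binary code."""
    parts = [((X - c) @ W > 0).astype(np.uint8) for c, W in zip(centers, Ws)]
    return np.concatenate(parts, axis=1)

rng = np.random.default_rng(2)
d, m, bits = 8, 4, 4                    # dimension, #clusters, bits per substring
centers = rng.standard_normal((m, d))   # placeholder cluster centers
Ws = [rng.standard_normal((d, bits)) for _ in range(m)]  # placeholder hyperplanes

X = rng.standard_normal((10, d))
codes = ch_style_encode(X, centers, Ws)  # 10 codes of m * bits = 16 bits
```

Because every point contributes a substring relative to every center, two nearby points on opposite sides of a cluster boundary still agree on most bits, unlike the cluster-index codes of KMH.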
Objective function
Assume there is a set of N data points {x_i}, x_i ∈ ℝ^d, forming the columns of the data matrix X ∈ ℝ^(d×N). The goal of the hashing method is to learn the corresponding binary codes {y_i}, y_i ∈ {0, 1}^K, forming the columns of the binary code matrix Y ∈ {0, 1}^(K×N), where K denotes the length of the binary code. As mentioned in Kong et al. [30], since directly learning the binary codes from the data is an NP-hard problem, most hashing methods [8], [13], [29] adopt a two-stage strategy.
Datasets and evaluation protocols
The experiments are performed on the following three datasets.
- (a) CIFAR-10 dataset [36]: CIFAR-10 is a set of 60,000 32 × 32 images, each represented by a 512-dimensional GIST feature [37]. 10,000 images are randomly selected as queries, and the rest are used for training and searching.
- (b) MNIST dataset [38]: MNIST is a set of handwritten digits with a training set of 60,000 images and a test set of 10,000 images. Each image is represented by an 800-dimensional feature vector.
Conclusion and future work
In this paper, we propose a new hashing method that simultaneously clusters the training data and learns hash functions in each cluster. The binary code of a data point is obtained by concatenating the substrings from each cluster. By clustering the data and integrating the information from each cluster, our method can handle data with complex distributions and alleviate the effect of cluster boundaries. Further, to minimize the quantization error between
Declaration of competing interest
We declare that we have no conflict of interest.
Acknowledgement
This work was supported in part by the Shenzhen Municipal Development and Reform Commission (Disciplinary Development Program for Data Science and Intelligent Computing), in part by the Shenzhen international cooperative research project GJHZ20170313150021171, and in part by the NSFC-Shenzhen Robot Joint Fund (U1613215).
Zhenyu Weng is a Ph.D. student in School of Electronics Engineering and Computer Science, Peking University. He received his B.S. degree in Computer Science from Sun Yat-sen University in 2013. His research interests include computer vision, machine learning and multimedia information retrieval.
References (48)
- et al., SCRATCH: A scalable discrete matrix factorization hashing for cross-modal retrieval, ACM on Multimedia Conference (2018)
- et al., Quantization-based hashing: a general framework for scalable image and video retrieval, Pattern Recognit. (2018)
- et al., Adaptive hash retrieval with kernel based similarity, Pattern Recognit. (2018)
- et al., Supervised discrete discriminant hashing for image retrieval, Pattern Recognit. (2018)
- et al., Supervised learning based discrete hashing for image retrieval, Pattern Recognit. (2019)
- et al., An improved density peaks clustering algorithm with fast finding cluster centers, Knowl.-Based Syst. (2018)
- et al., A multiway p-spectral clustering algorithm, Knowl.-Based Syst. (2019)
- et al., Optimised kd-trees for fast image descriptor matching, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2008)
- et al., A general two-step approach to learning-based hashing, Proceedings of the IEEE International Conference on Computer Vision (2013)
- et al., Asymmetric distances for binary embeddings, IEEE Trans. Pattern Anal. Mach. Intell. (2014)
- Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions, 47th Annual IEEE Symposium on Foundations of Computer Science
- Kernelized locality-sensitive hashing, IEEE Trans. Pattern Anal. Mach. Intell.
- Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval, IEEE Trans. Pattern Anal. Mach. Intell.
- Deep hashing for compact binary codes learning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
- A survey on learning to hash, IEEE Trans. Pattern Anal. Mach. Intell.
- Distributed adaptive binary quantization for fast nearest neighbor search, IEEE Trans. Image Process.
- K-means hashing: an affinity-preserving quantization method for learning binary compact codes, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
- Large graph hashing with spectral rotation, Association for the Advancement of Artificial Intelligence
- Column sampling based discrete supervised hashing, Association for the Advancement of Artificial Intelligence
- Deep supervised discrete hashing, Advances in Neural Information Processing Systems
- Deep priority hashing, ACM on Multimedia Conference
- Supervised hashing with kernels, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
- Discrete graph hashing, Advances in Neural Information Processing Systems
- Toward optimal manifold hashing via discrete locally linear embedding, IEEE Trans. Image Process.
Yuesheng Zhu received his B.Eng. degree in radio engineering, M. Eng. degree in circuits and systems and Ph.D. degree in electronics engineering in 1982, 1989 and 1996, respectively. He is currently working as a professor at the Lab of Communication and Information Security, Shenzhen Graduate School, Peking University. He is a senior member of IEEE, fellow of China Institute of Electronics, and senior member of China Institute of Communications. His interests include digital signal processing, multimedia technology, communication and information security.