Concatenation Hashing: A Relative Position Preserving Method for Learning Binary Codes


Pattern Recognition (IF 7.5), Pub Date: 2020-04-01, DOI: 10.1016/j.patcog.2019.107151
Zhenyu Weng, Yuesheng Zhu

Abstract: Hashing methods perform efficient nearest neighbor search by mapping high-dimensional data to binary codes. Compared with projection-based hashing methods, hashing methods that adopt clustering can encode complex relationships in the data into binary codes. However, their search performance is affected by cluster boundaries: two similar data points may be assigned to different clusters and then encoded into very different binary codes. In this paper, we propose a new clustering-based hashing method that alleviates the effect of cluster boundaries. It is motivated by the observation that any two close data points have similar positions relative to each cluster center. An alternating optimization is developed to simultaneously discover the cluster structure of the data and learn hash functions that preserve the relative positions of the data to each cluster center. To integrate the information from every cluster, the binary code of each data point is obtained by concatenating the substrings learned by the hash functions of each cluster. Experiments show that our method is competitive with or better than state-of-the-art hashing methods.
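The concatenation idea in the abstract can be illustrated with a rough sketch. This is not the authors' method: the paper learns the cluster structure and the hash functions jointly through an alternating optimization, whereas the sketch below assumes plain k-means for the centers and random hyperplanes in place of learned hash functions. The function names (`kmeans`, `fit_concat_hash`, `encode`, `hamming`) and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; an assumed stand-in for the paper's cluster discovery."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every center, shape (n, k).
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers


def fit_concat_hash(X, n_clusters=4, bits_per_cluster=8, seed=0):
    """Return cluster centers and one projection matrix per cluster.

    Assumption: the projections are random Gaussian hyperplanes,
    not hash functions learned as in the paper.
    """
    rng = np.random.default_rng(seed)
    centers = kmeans(X, n_clusters, seed=seed)
    W = rng.standard_normal((n_clusters, X.shape[1], bits_per_cluster))
    return centers, W


def encode(X, centers, W):
    """Hash each point's position relative to every center, then concatenate."""
    substrings = []
    for c, w in zip(centers, W):
        rel = X - c  # relative position of each point to this center
        substrings.append((rel @ w > 0).astype(np.uint8))
    return np.concatenate(substrings, axis=1)


def hamming(a, b):
    """Hamming distance between two binary codes."""
    return int(np.count_nonzero(a != b))
```

Because every point is encoded against every center rather than only its own cluster, two nearby points that fall on opposite sides of a cluster boundary still receive substrings computed from nearly identical relative positions. With `n_clusters=4` and `bits_per_cluster=8`, `encode` yields 32-bit codes that can be compared with `hamming`.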


Updated: 2020-04-01
