CDbin: Compact Discriminative Binary Descriptor Learned with Efficient Neural Network
IEEE Transactions on Circuits and Systems for Video Technology (IF 8.4), Pub Date: 2020-03-01, DOI: 10.1109/tcsvt.2019.2896095
Jianming Ye, Shiliang Zhang, Tiejun Huang, Yong Rui

As an important computer vision task, image matching requires efficient and discriminative local descriptors. Most existing descriptors, such as SIFT and ORB, are hand-crafted; it is therefore necessary to study better-optimized descriptors through end-to-end learning. This paper proposes compact binary descriptors learned with a lightweight Convolutional Neural Network (CNN) that is efficient for both training and testing. Specifically, we propose a CNN with no more than five layers for descriptor learning. The resulting descriptors, i.e., Compact Discriminative binary descriptors (CDbin), are optimized with four complementary loss functions: 1) a triplet loss to ensure discriminative power; 2) a quantization loss to decrease the quantization error; 3) a correlation loss to ensure feature compactness; and 4) an even-distribution loss to enrich the embedded information. Extensive experiments on two image patch datasets and three image retrieval datasets show that CDbin exhibits competitive performance compared with existing descriptors. For example, the 64-bit CDbin substantially outperforms the 256-bit ORB and the 1024-bit SIFT on the HPatches dataset. Although generated by a shallow CNN, CDbin also outperforms several recent deep descriptors.
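
The abstract names the four loss terms but does not give their formulas. The following is a minimal NumPy sketch of one common way such terms are written, to make their roles concrete. It is an illustration under stated assumptions, not the paper's formulation: every function name, the margin, the weights, and the exact form of each term are hypothetical.

```python
import numpy as np

# Illustrative only: the abstract names four loss terms without formulas.
# All function names, the margin, and the weights are assumptions.

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on distances: matching patches closer than non-matching ones."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(0.0, margin + d_pos - d_neg)))

def quantization_loss(x):
    """Mean squared gap between real-valued outputs and their +/-1 codes."""
    codes = np.where(x >= 0, 1.0, -1.0)
    return float(np.mean((x - codes) ** 2))

def correlation_loss(x, eps=1e-8):
    """Mean squared off-diagonal correlation between descriptor dimensions."""
    n, d = x.shape
    centered = x - x.mean(axis=0)
    z = centered / (centered.std(axis=0) + eps)
    corr = z.T @ z / n
    off_diag = corr - np.diag(np.diag(corr))
    return float(np.sum(off_diag ** 2) / (d * (d - 1)))

def even_distribution_loss(x):
    """Push each dimension's batch mean to 0 so bits split evenly."""
    return float(np.mean(x.mean(axis=0) ** 2))

def total_loss(anchor, positive, negative, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms; the weights are placeholders."""
    w_t, w_q, w_c, w_e = weights
    return (w_t * triplet_loss(anchor, positive, negative)
            + w_q * quantization_loss(anchor)
            + w_c * correlation_loss(anchor)
            + w_e * even_distribution_loss(anchor))

def binarize(x):
    """Threshold real-valued outputs at zero to obtain binary codes."""
    return x >= 0

def hamming_distance(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.count_nonzero(a != b))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for 64-dimensional network outputs on 8 triplets.
    a, p, n = (rng.standard_normal((8, 64)) for _ in range(3))
    print("total loss:", total_loss(a, p, n))
    print("hamming:", hamming_distance(binarize(a[0]), binarize(p[0])))
```

In this sketch, binarization is a simple sign threshold and matching uses the Hamming distance, which is the usual reason binary descriptors are cheap to compare. The quantization term is what keeps that threshold from discarding much information.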

Updated: 2020-03-01