Zero-Shot Learning to Index on Semantic Trees for Scalable Image Retrieval, IEEE Transactions on Image Processing


IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2020-11-16, DOI: 10.1109/tip.2020.3036779
Shichao Kan, Yi Cen, Yigang Cen, Mladenovic Vladimir, Yang Li, Zhihai He

In this study, we develop a new approach, called zero-shot learning to index on semantic trees (LTI-ST), for efficient image indexing and scalable image retrieval. Our method uses a binary semantic tree to learn the inherent correlation structure between visual representations of training images, and this learned structure transfers effectively to new test images from unknown classes. Based on the predicted correlation structure, we construct an efficient indexing scheme for the whole test image set. Unlike existing image indexing methods, the proposed LTI-ST method has two unique characteristics. First, it does not need to analyze the test images in the query database to construct the index structure; instead, the index structure is predicted directly by a network learned from the training set. This zero-shot capability is critical for flexible, distributed, and scalable deployment of image indexing and retrieval services at large scale. Second, unlike existing distance-based indexing methods, our index structure is learned by the LTI-ST deep neural network with binary encoding and decoding on a hierarchical semantic tree. Extensive experiments on benchmark datasets, together with ablation studies, demonstrate that LTI-ST outperforms existing indexing methods by a large margin while providing these new capabilities, which are highly desirable in practice.
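The general idea of indexing images by a predicted path in a binary tree can be illustrated with a small sketch. This is not the LTI-ST network: the abstract does not describe the architecture or training, so the learned encoder is replaced here by fixed hyperplane sign tests, and the names `sign_code` and `BinaryTreeIndex` are hypothetical. What the sketch does show is the property the abstract emphasizes: each image is placed in the index from its own predicted code alone, without analyzing the rest of the database.

```python
from collections import defaultdict


def sign_code(feature, hyperplanes):
    """Stand-in for a learned encoder (assumption, not the paper's network).

    Emits one bit per tree level: '1' if the feature lies on the
    non-negative side of that level's hyperplane, else '0'.
    """
    bits = []
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, feature))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


class BinaryTreeIndex:
    """Binary-tree index keyed by code prefixes (illustrative only)."""

    def __init__(self, encode):
        self.encode = encode            # maps a feature vector to a bit string
        self.nodes = defaultdict(list)  # code prefix -> image ids in that subtree

    def add(self, image_id, feature):
        # Each image is inserted independently from its predicted code;
        # no clustering or analysis of the other images is needed.
        code = self.encode(feature)
        for depth in range(len(code) + 1):
            self.nodes[code[:depth]].append(image_id)

    def query(self, feature, min_results=1):
        # Start at the predicted leaf and widen to ancestor subtrees
        # until enough candidates are found.
        code = self.encode(feature)
        for depth in range(len(code), -1, -1):
            hits = self.nodes.get(code[:depth], [])
            if len(hits) >= min_results:
                return list(hits)
        return []


# Two tree levels, split on the sign of each coordinate (toy values).
planes = [(1.0, 0.0), (0.0, 1.0)]
index = BinaryTreeIndex(lambda f: sign_code(f, planes))
index.add("a", (1.0, 1.0))    # code "11"
index.add("b", (1.0, -1.0))   # code "10"
index.add("c", (-1.0, -1.0))  # code "00"

print(index.query((2.0, 3.0)))                 # ['a']
print(index.query((2.0, 3.0), min_results=2))  # ['a', 'b']
```

Storing each image id at every prefix of its code trades memory for a lookup cost proportional to the tree depth. A real system would rank the returned candidates by feature distance, which this sketch omits.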


Updated: 2020-11-27
