Large Scale Category-Structured Image Retrieval For Object Identification Through Supervised Learning of CNN and SURF Based Matching
IEEE Access (IF 3.9), Pub Date: 2020-01-01, DOI: 10.1109/access.2020.2982560
Xiaoqing Li, Jiansheng Yang, Jinwen Ma

In the modern era of Internet, mobile, and digital information technology, image retrieval for object identification, such as retrieving a wine label from a wine bottle image, has become an important and urgent problem in artificial intelligence. Compared with general image retrieval, it is considerably more challenging because a huge number of object identification or brand images are very similar and difficult to discriminate, and the number of images per brand in a given dataset varies greatly; that is, the samples are strongly unbalanced across brands. In this paper, we propose a CNN-SURF Consecutive Filtering and Matching (CSCFM) framework for this kind of image retrieval, focusing specifically on wine label retrieval. In particular, a Convolutional Neural Network (CNN) is used to filter out impossible main-brands (manufacturers), narrowing the range of retrieval, and Speeded Up Robust Features (SURF) matching is improved by adopting the RANdom SAmple Consensus (RANSAC) mechanism and a modified Term Frequency–Inverse Document Frequency (TF-IDF) distance for accurate retrieval of the sub-brand (the item attribute under the manufacturer). The experiments are conducted on a dataset containing approximately 548k images of wine labels with 17,328 main-brands and 260,579 sub-brands. The experimental results demonstrate that the proposed method solves the wine label retrieval problem effectively and efficiently. Moreover, the proposed method is further evaluated on two public benchmarks for object identification image retrieval, the Oxford Buildings Benchmark (Oxford5k) and the University of Kentucky of Indoor Things Benchmark (UKB), and achieves 88.3% mean average precision on Oxford5k and an N-S score of 3.92 on UKB.
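
The two-stage idea in the abstract (first filter by main-brand, then match within the surviving candidates) can be illustrated with a minimal sketch. This is not the authors' implementation. The classifier scores stand in for CNN outputs, the integer "visual words" stand in for quantized SURF descriptors, the ranking uses a standard TF-IDF cosine similarity rather than the paper's modified TF-IDF distance, and RANSAC verification is omitted. All function names, field names, and data below are hypothetical.

```python
import math
from collections import Counter


def top_k_brands(brand_scores, k):
    """Stage 1: keep the k main-brands with the highest classifier scores."""
    return sorted(brand_scores, key=brand_scores.get, reverse=True)[:k]


def idf_weights(database):
    """Inverse document frequency of every visual word in the database."""
    n = len(database)
    doc_freq = Counter()
    for item in database:
        doc_freq.update(set(item["words"]))
    return {w: math.log(n / c) for w, c in doc_freq.items()}


def tfidf_vector(words, idf):
    """Sparse TF-IDF vector (dict) for one image's visual words."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: (c / total) * idf.get(w, 0.0) for w, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query_words, brand_scores, database, k=2):
    """Stage 2: rank sub-brands, but only inside the main-brands kept by stage 1."""
    kept = set(top_k_brands(brand_scores, k))
    idf = idf_weights(database)
    query_vec = tfidf_vector(query_words, idf)
    ranked = [
        (cosine(query_vec, tfidf_vector(item["words"], idf)), item["sub_brand"])
        for item in database
        if item["main_brand"] in kept
    ]
    ranked.sort(reverse=True)
    return ranked


# Hypothetical toy data: integers play the role of quantized local descriptors.
database = [
    {"main_brand": "A", "sub_brand": "A-red", "words": [1, 2, 3, 3]},
    {"main_brand": "A", "sub_brand": "A-white", "words": [1, 4, 5]},
    {"main_brand": "B", "sub_brand": "B-rose", "words": [2, 3, 3, 6]},
    {"main_brand": "C", "sub_brand": "C-brut", "words": [7, 8, 9]},
]
brand_scores = {"A": 0.7, "B": 0.2, "C": 0.1}  # stand-in for CNN outputs
query_words = [3, 3, 2, 6]

for score, sub_brand in retrieve(query_words, brand_scores, database, k=2):
    print(f"{sub_brand}: {score:.3f}")
```

In this toy setup, keeping two main-brands lets the true match "B-rose" survive the filter, whereas `k=1` would discard it before matching. This shows the trade-off a filtering stage introduces: a smaller candidate set speeds up matching, but a correct brand that is filtered out cannot be recovered later.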

Updated: 2020-01-01