Learning to locate for fine-grained image recognition
Computer Vision and Image Understanding (IF 4.3), Pub Date: 2021-02-16, DOI: 10.1016/j.cviu.2021.103184
Jiamin Chen, Jianguo Hu, Shiren Li

In this paper, we propose an end-to-end weakly supervised method for fine-grained image recognition, called the bounding box-part location (BBPL) method, which can locate the object and its parts precisely without part annotations. The proposed method comprises three modules: object detection, ObjectMask, and classification. First, the object detection module predicts bounding boxes, and the ObjectMask module uses the predicted boxes to generate a mask that suppresses background interference during recognition. Second, the classification module is divided into two branches: global feature classification and local feature classification. In the global branch, a global feature is extracted to obtain a global classification result. In the local branch, salient points are first detected by our novel salient point detection module, which greatly reduces the time cost compared with most existing local feature extraction methods. Local features are then extracted at the detected salient points, and the local branch produces a local classification result. Finally, the results of the two classification branches are fused to give the final prediction. In experiments on three widely used fine-grained image recognition datasets (CUB-200-2011, Stanford Cars, Stanford Dogs), our method achieves state-of-the-art performance.
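
The abstract does not specify how the mask is built or how the two branch outputs are fused, so the following is only a minimal sketch under stated assumptions: the mask is a hard binary rectangle taken from one predicted box, and fusion is a weighted average of per-branch softmax probabilities. The names `box_to_mask`, `apply_mask`, `fuse`, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
from math import exp


def box_to_mask(height, width, box):
    """Return a binary mask (list of rows): 1 inside the box, 0 outside.

    Assumption: box is (x0, y0, x1, y1) in pixels, half-open on the
    right/bottom edge, and is clipped to the image bounds.
    """
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    return [
        [1 if (y0 <= y < y1 and x0 <= x < x1) else 0 for x in range(width)]
        for y in range(height)
    ]


def apply_mask(image, mask):
    """Zero out background pixels by element-wise multiplication."""
    return [[p * m for p, m in zip(row, mrow)] for row, mrow in zip(image, mask)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(global_logits, local_logits, weight=0.5):
    """Assumed fusion rule: weighted average of the two branches' probabilities."""
    g = softmax(global_logits)
    l = softmax(local_logits)
    return [weight * a + (1.0 - weight) * b for a, b in zip(g, l)]
```

For example, `apply_mask(image, box_to_mask(h, w, box))` would give the background-suppressed input, and the index of the largest value in `fuse(global_logits, local_logits)` would be the predicted class under this assumed fusion rule.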




Updated: 2021-02-26