Weakly supervised fine-grained recognition based on spatial-channel aware attention filters
Multimedia Tools and Applications (IF 3.0), Pub Date: 2021-01-23, DOI: 10.1007/s11042-020-10268-y
Nannan Yu, Lei Huang, Zhiqiang Wei, Wenfeng Zhang, Bin Wang

Fine-grained recognition is very challenging because it is difficult to mine subtle, discriminative features for objects with similar visual appearance. Because massive manual annotation (e.g., bounding boxes for discriminative regions) is time-consuming and labor-intensive, existing methods use a single form of attention model to output discriminative regions in a weakly supervised way. In this paper, we propose a novel method named Spatial-Channel Aware Attention Filters (SCAF) to address the weakly supervised fine-grained recognition problem. Compared with previous attention models, SCAF obtains attention-aware features along two dimensions: the spatial locations of the image and the channels of the feature maps. With the proposed SCAF, the model can enhance discriminative regions in the spatial and channel dimensions simultaneously. In addition, a multi-channel network and a multi-level structure are designed to extract richer regional features. Moreover, focal loss is introduced to balance the sample distribution of fine-grained image datasets. Comprehensive comparative experiments are conducted on publicly available datasets, and the results show that our method achieves state-of-the-art performance on fine-grained recognition tasks. For instance, we achieve 99.370% and 80.749% accuracy on two underwater datasets, Fish4Knowledge and Wild Fish, respectively.
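
The abstract states that focal loss is used to balance the sample distribution but does not give its settings. As a rough illustration only, the sketch below implements the standard multi-class focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), in plain Python. The values gamma = 2.0 and alpha = 1.0 and the example logits are assumptions chosen for illustration, not the configuration used in the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of class scores."""
    m = max(logits)
    log_total = math.log(sum(math.exp(z - m) for z in logits))
    return [(z - m) - log_total for z in logits]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one sample.

    gamma and alpha are illustrative defaults (assumptions), not values
    taken from the paper. With gamma = 0 and alpha = 1 this reduces to
    ordinary cross-entropy.
    """
    log_p_t = log_softmax(logits)[target]
    p_t = math.exp(log_p_t)
    # (1 - p_t)^gamma shrinks the loss of samples the model already gets right.
    return -alpha * (1.0 - p_t) ** gamma * log_p_t


def cross_entropy(logits, target):
    """Cross-entropy expressed as the gamma = 0 special case of focal loss."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)


if __name__ == "__main__":
    # Hypothetical logits: one confidently correct sample, one misclassified.
    samples = {"easy": [4.0, 0.0, 0.0], "hard": [0.0, 1.0, 0.0]}
    for name, logits in samples.items():
        ce = cross_entropy(logits, 0)
        fl = focal_loss(logits, 0)
        print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

Because the modulating factor (1 - p_t)^gamma is close to zero for well-classified samples, the many easy examples from frequent classes contribute little to the total loss, which leaves more of the gradient for hard or rare-class examples.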




Updated: 2021-01-24