Information Sciences, Pub Date: 2021-05-26, DOI: 10.1016/j.ins.2021.05.040
Xiaozhang Liu, Lifeng Zhang, Tao Li, Dejian Wang, Zhaojie Wang
Fine-grained image classification requires distinguishing the subtle differences among the subclasses of a main category. Intuitively, the key lies in locating and identifying the detailed differences in local regions and capturing their feature representations. In this paper, we propose combining an attention module with a multi-scale latent representation network to locate the discriminative spatial regions and then learn an accurate attention map to assist the category decision. Furthermore, an attention module is also used to determine the channel weights of the feature maps at each scale before the final step. Extensive experiments demonstrate that our model achieves competitive performance against state-of-the-art baselines on two benchmark datasets. The attention validation experiments further reveal the model's ability to choose the proper channel features for low-quality image categorization.
Title (translated from the Chinese): Dual-attention-guided multi-scale CNN for fine-grained image classification
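The abstract names two attention steps over multi-scale feature maps but gives no implementation details, so the following is only a rough NumPy sketch of the general idea, not the authors' network. Everything in it is an assumption made for illustration: the function names, the feature-map shapes, the softmax over channel-averaged activations used as a stand-in spatial attention map, and the squeeze-and-excitation-style gate (with random, untrained weights) used as a stand-in for channel attention.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_attention(feat):
    """Hypothetical spatial attention: feat (C, H, W) -> map (H, W) summing to 1."""
    _, h, w = feat.shape
    scores = feat.mean(axis=0)  # channel-averaged activation per location
    return softmax(scores.reshape(-1)).reshape(h, w)


def channel_attention(feat, w1, w2):
    """Hypothetical SE-style gate: feat (C, H, W) -> per-channel weights in [0, 1]."""
    squeezed = feat.mean(axis=(1, 2))          # global average pool, shape (C,)
    hidden = np.maximum(0.0, w1 @ squeezed)    # ReLU bottleneck, shape (C // r,)
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid, shape (C,)


def dual_attention_descriptor(feats, params):
    """Pool each scale with its spatial map, reweight channels, then concatenate."""
    parts = []
    for feat, (w1, w2) in zip(feats, params):
        attn = spatial_attention(feat)                         # (H, W)
        pooled = (feat * attn[None, :, :]).sum(axis=(1, 2))    # attention-weighted pooling, (C,)
        weights = channel_attention(feat, w1, w2)              # (C,)
        parts.append(weights * pooled)
    return np.concatenate(parts)


# Assumed toy setup: three scales with random features and random gate weights.
rng = np.random.default_rng(0)
shapes = [(8, 16, 16), (16, 8, 8), (32, 4, 4)]
feats = [rng.standard_normal(s) for s in shapes]
params = [
    (rng.standard_normal((c // 4, c)), rng.standard_normal((c, c // 4)))
    for c, _, _ in shapes
]
descriptor = dual_attention_descriptor(feats, params)
print(descriptor.shape)  # (56,)
```

In a trained model the attention map and the gate weights would be learned jointly with the classifier; here the concatenated 56-dimensional vector (8 + 16 + 32 channels) only shows how per-scale spatial pooling and channel weighting could be composed before a final classification layer.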