Progressive Co-Attention Network for Fine-grained Visual Classification
arXiv - CS - Computer Vision and Pattern Recognition
Pub Date: 2021-01-21, DOI: arxiv-2101.08527
Tian Zhang, Dongliang Chang, Zhanyu Ma, Jun Guo
Fine-grained visual classification aims to recognize images belonging to
multiple sub-categories within the same category. It is a challenging task due to
the inherently subtle variations among highly confused categories. Most
existing methods take only an individual image as input, which may limit the
ability of models to recognize contrastive clues from different images. In this
paper, we propose an effective method called progressive co-attention network
(PCA-Net) to tackle this problem. Specifically, we calculate the channel-wise
similarity by interacting the feature channels within same-category images to
capture the common discriminative features. Considering that complementary
information is also crucial for recognition, we erase the prominent areas
enhanced by the channel interaction to force the network to focus on other
discriminative regions. The proposed model can be trained in an end-to-end
manner, and only requires image-level label supervision. It has achieved
competitive results on three fine-grained visual classification benchmark
datasets: CUB-200-2011, Stanford Cars, and FGVC Aircraft.
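
The two steps described above, channel interaction between same-category images followed by erasing the most prominent regions, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the normalisation, or the erasing rule, so the dot-product similarity, row-wise softmax, residual sum, and the `ratio` threshold below are all assumptions, and the function names `channel_co_attention` and `erase_prominent` are hypothetical.

```python
import numpy as np


def channel_co_attention(feat_a, feat_b):
    """Enhance feat_a with channels of feat_b, weighted by channel similarity.

    feat_a, feat_b: arrays of shape (C, H, W), assumed to be non-negative
    (e.g. post-ReLU) feature maps of two images from the same category.
    """
    c, h, w = feat_a.shape
    a = feat_a.reshape(c, h * w)
    b = feat_b.reshape(c, h * w)
    # (C, C) dot-product similarity between channels of a and channels of b.
    sim = a @ b.T
    # Row-wise softmax (an assumed normalisation), shifted for stability.
    sim = sim - sim.max(axis=1, keepdims=True)
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)
    # Assumed residual form: original features plus the interacted features.
    enhanced = a + weights @ b
    return enhanced.reshape(c, h, w)


def erase_prominent(feat, ratio=0.8):
    """Zero out spatial positions whose mean activation exceeds ratio * max.

    The threshold rule is an assumption; it relies on non-negative features.
    """
    attention = feat.mean(axis=0)                  # (H, W) attention map
    keep = attention <= ratio * attention.max()    # False on prominent areas
    return feat * keep[None, :, :]


# Toy usage with random non-negative features standing in for CNN outputs.
rng = np.random.default_rng(0)
feat_a = rng.random((4, 3, 3))
feat_b = rng.random((4, 3, 3))
enhanced = channel_co_attention(feat_a, feat_b)
erased = erase_prominent(enhanced)
```

In this sketch, the erased features would be passed to a further classification branch so that the network has to rely on regions other than the most strongly activated ones; how the paper combines the branches is not stated in the abstract.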
Updated: 2021-01-22