Feature Boosting, Suppression, and Diversification for Fine-Grained Visual Classification, arXiv - CS - Computer Vision and Pattern Recognition



arXiv - CS - Computer Vision and Pattern Recognition Pub Date : 2021-03-04 , DOI: arxiv-2103.02782
Jianwei Song, Ruoyu Yang

Learning feature representations from discriminative local regions plays a key role in fine-grained visual classification. Employing attention mechanisms to extract part features has become a trend. However, these methods have two major limitations. First, they often focus on the most salient part while neglecting other inconspicuous but distinguishable parts. Second, they treat different part features in isolation and neglect the relationships among them. To address these limitations, we propose to locate multiple distinct, distinguishable parts and to explore their relationships explicitly. To this end, we introduce two lightweight modules that can be easily plugged into existing convolutional neural networks. On the one hand, we introduce a feature boosting and suppression module that boosts the most salient part of the feature maps to obtain a part-specific representation, and then suppresses that part to force the subsequent network to mine other potential parts. On the other hand, we introduce a feature diversification module that learns semantically complementary information from the correlated part-specific representations. Our method does not need bounding-box or part annotations and can be trained end-to-end. Extensive experimental results show that our method achieves state-of-the-art performance on several benchmark fine-grained datasets.
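
The boost-then-suppress idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `boost_and_suppress`, the stripe-wise split along the width axis, the use of mean activation as the importance score (a learned layer would likely play this role in practice), and the parameters `k`, `alpha`, and `beta` are all assumptions made for illustration.

```python
import numpy as np


def boost_and_suppress(x, k=4, alpha=0.5, beta=0.5):
    """Illustrative sketch only (hypothetical names and parameters).

    x     : feature map of shape (C, H, W); W must be divisible by k.
    k     : number of vertical stripes treated as candidate parts (assumption).
    alpha : boosting strength (assumption).
    beta  : suppression strength for the most salient stripe (assumption).
    """
    c, h, w = x.shape
    if w % k != 0:
        raise ValueError("width must be divisible by k")
    stripe = w // k

    # Assumed importance score: mean activation of each stripe, shape (k,).
    scores = x.reshape(c, h, k, stripe).mean(axis=(0, 1, 3))

    # Softmax turns the scores into normalized importance weights.
    e = np.exp(scores - scores.max())
    weights = e / e.sum()

    # Boost: add a weighted copy of the features, emphasizing salient stripes.
    col_weights = np.repeat(weights, stripe)  # one weight per column, shape (W,)
    boosted = x + alpha * x * col_weights

    # Suppress: damp only the most salient stripe so that later layers
    # receive relatively more signal from the remaining stripes.
    factors = np.ones(k)
    factors[np.argmax(weights)] = 1.0 - beta
    suppressed = x * np.repeat(factors, stripe)

    return boosted, suppressed
```

In such a sketch, `boosted` would feed a part-specific branch, while `suppressed` would be passed to the following layers so that they are pushed toward other regions.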

Updated: 2021-03-05
