Multiresolution Discriminative Mixup Network for Fine-Grained Visual Categorization, IEEE Transactions on Neural Networks and Learning Systems


IEEE Transactions on Neural Networks and Learning Systems (IF 10.2), Pub Date: 2021-10-05, DOI: 10.1109/tnnls.2021.3112768
Kunran Xu, Rui Lai, Lin Gu, Yishi Li


Fine-grained visual categorization (FGVC) is challenging because fine-grained classes contain many hard examples that differ only subtly in particular local regions. To address this, many methods resort to high-resolution source images, while others adopt regularization such as “mixup” or “between-class learning.” Despite their promising results, both approaches have drawbacks: mixup tends to cause the manifold intrusion problem, which results in under-fitting and degraded model performance, and high-resolution input inevitably leads to high computational cost. In view of this, we present a multiresolution discriminative mixup network (MRDMN). Unlike standard mixup, the proposed discriminative mixup strategy linearly mixes discriminative regions rather than entire images to avoid manifold intrusion, which lets the network learn local detail features more effectively and contributes to more precise categorization. Furthermore, a resolution-based distillation strategy is designed to transfer multiresolution detail feature representations to a low-resolution network, which both speeds up testing and boosts categorization accuracy. Extensive experiments demonstrate that the proposed MRDMN outperforms most competitive approaches with less computation time on the CUB-200-2011, Stanford-Cars, Stanford-Dogs, Food-101, and iNaturalist 2017 datasets. The code is available at https://github.com/aztc/MRDMN.
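
The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (the repository linked above is the reference for that): the abstract does not say how discriminative regions are located or which distillation loss is used, so the sketch assumes the regions arrive as equal-sized boxes and swaps in a generic temperature-softened KL distillation loss between a high-resolution teacher's logits and a low-resolution student's logits. All function names (`crop`, `region_mixup`, `distillation_loss`) and the box format are hypothetical.

```python
import numpy as np


def crop(image, box):
    """Return the (top, left, height, width) region of an H x W x C image."""
    top, left, height, width = box
    return image[top:top + height, left:left + width]


def region_mixup(img_a, img_b, box_a, box_b, label_a, label_b, lam):
    """Linearly mix two equal-sized regions and their one-hot labels.

    Assumption: the boxes are supplied by some external localizer and
    have identical height and width; this is not taken from the paper.
    """
    patch_a = crop(img_a, box_a).astype(np.float64)
    patch_b = crop(img_b, box_b).astype(np.float64)
    if patch_a.shape != patch_b.shape:
        raise ValueError("regions must have the same shape")
    mixed_patch = lam * patch_a + (1.0 - lam) * patch_b
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed_patch, mixed_label


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) on temperature-softened predictions.

    Generic knowledge-distillation loss, used here as a stand-in for the
    paper's resolution-based distillation, whose exact form is not given.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = np.sum(np.exp(log_p) * (log_p - log_q), axis=-1)
    return float(np.mean(kl))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img_a, img_b = rng.random((8, 8, 3)), rng.random((8, 8, 3))
    lam = rng.beta(1.0, 1.0)  # mixing weight, as in standard mixup
    patch, label = region_mixup(
        img_a, img_b, (0, 0, 4, 4), (2, 2, 4, 4),
        np.array([1.0, 0.0]), np.array([0.0, 1.0]), lam,
    )
    print(patch.shape, label)
    print(distillation_loss(rng.normal(size=(5, 2)), rng.normal(size=(5, 2))))
```

Mixing only the cropped regions keeps the interpolation confined to the parts of the images assumed to carry class evidence, which is the contrast with whole-image mixup that the abstract draws; in the distillation function, the teacher logits would come from a network fed higher-resolution input than the student.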


Updated: 2021-10-05
