Penalizing the Hard Example But Not Too Much: A Strong Baseline for Fine-Grained Visual Classification
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2) Pub Date: 2022-11-21, DOI: 10.1109/tnnls.2022.3213563
Yuanzhi Liang, Linchao Zhu, Xiaohan Wang, Yi Yang

Though significant progress has been achieved on fine-grained visual classification (FGVC), severe overfitting still hinders model generalization. A recent study shows that hard examples in the training set can be fit easily, yet most existing FGVC methods fail to classify some hard examples in the test set: the model overfits the hard examples in the training set but does not learn to generalize to unseen examples in the test set. In this article, we propose a moderate hard example modulation (MHEM) strategy to properly modulate the hard examples. MHEM encourages the model not to overfit hard examples and offers better generalization and discrimination. First, we introduce three conditions and formulate a general form of the modulated loss function. Second, we instantiate the loss function and provide a strong baseline for FGVC, with which the performance of a naive backbone can be boosted to be comparable with recent methods. Moreover, we demonstrate that our baseline can be readily incorporated into existing methods and empowers them to be more discriminative. Equipped with our strong baseline, we achieve consistent improvements on three typical FGVC datasets, i.e., CUB-200-2011, Stanford Cars, and FGVC-Aircraft. We hope the idea of moderate hard example modulation will inspire future research toward more effective fine-grained visual recognition.
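To make the idea of "penalizing the hard example but not too much" concrete, the sketch below shows one plausible reading in PyTorch: a cross-entropy loss whose per-example weight shrinks toward a floor as the predicted probability of the true class drops, so very hard examples still contribute but cannot dominate training. This is a minimal illustrative sketch only; the function name mhem_loss and the knobs gamma and floor are hypothetical, and the paper's actual three conditions and instantiated loss are not reproduced here.

import torch
import torch.nn.functional as F

def mhem_loss(logits, targets, gamma=0.5, floor=0.1):
    """Illustrative moderate modulation of cross-entropy (not the paper's exact loss).

    Hard examples (low probability on the true class) keep a large raw
    cross-entropy but receive a smaller weight, so they are penalized
    "but not too much".
    """
    ce = F.cross_entropy(logits, targets, reduction="none")
    with torch.no_grad():
        # p_t: predicted probability of the ground-truth class per example.
        p_t = F.softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
        # Modulating factor: near 1 for easy examples, decaying toward
        # `floor` for hard ones, so hard examples cannot dominate the gradient.
        weight = floor + (1.0 - floor) * p_t.pow(gamma)
    return (weight * ce).mean()

# Usage sketch, e.g., with the 200 classes of CUB-200-2011:
logits = torch.randn(8, 200)
targets = torch.randint(0, 200, (8,))
loss = mhem_loss(logits, targets)

Note the contrast with focal loss, which up-weights hard examples; a moderate modulation in this spirit instead caps their influence, matching the abstract's motivation of not overfitting hard training examples.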

Updated: 2024-08-26