Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition,IEEE Transactions on Image Processing


Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition
IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2020-03-06, DOI: 10.1109/tip.2020.2977457
Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang

Bilinear pooling has achieved great success in fine-grained visual recognition (FGVC). Recent methods have shown that matrix power normalization can stabilize the second-order information in bilinear features, but some problems, e.g., redundant information and over-fitting, remain to be resolved. In this paper, we propose an efficient Multi-Objective Matrix Normalization (MOMN) method that simultaneously normalizes a bilinear representation in terms of square-root, low-rank, and sparsity. These three regularizers not only stabilize the second-order information, but also compact the bilinear features and promote model generalization. In MOMN, a core challenge is how to jointly optimize three non-smooth regularizers with different convexity properties. To this end, MOMN first formulates them into an augmented Lagrangian with approximated regularizer constraints. Then, auxiliary variables are introduced to relax the different constraints, which allows each regularizer to be solved alternately. Finally, several updating strategies based on gradient descent are designed to obtain consistent convergence and an efficient implementation. Consequently, MOMN is implemented with only matrix multiplication, which is well suited to GPU acceleration, and the normalized bilinear features are stable and discriminative. Experiments on five public benchmarks for FGVC demonstrate that the proposed MOMN is superior to existing normalization-based methods in terms of both accuracy and efficiency. The code is available at https://github.com/mboboGO/MOMN.
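To make two of the ingredients named in the abstract concrete, here is a minimal NumPy sketch. It is not the MOMN solver and is not taken from the authors' repository. It assumes (a) a bilinear feature formed as the averaged outer product of local descriptors, (b) the coupled Newton-Schulz iteration as a stand-in for a square-root normalization that uses only matrix multiplication, and (c) element-wise soft-thresholding, the standard proximal step for an L1 sparsity penalty. The function names `bilinear_pool`, `newton_schulz_sqrt`, and `soft_threshold` are hypothetical, and the low-rank term and the augmented Lagrangian coupling are omitted.

```python
import numpy as np


def bilinear_pool(features: np.ndarray) -> np.ndarray:
    """Average the outer products of n local descriptors: (n, d) -> (d, d)."""
    n = features.shape[0]
    return features.T @ features / n


def newton_schulz_sqrt(a: np.ndarray, num_iters: int = 20) -> np.ndarray:
    """Approximate the square root of a symmetric positive-definite matrix.

    Uses the coupled Newton-Schulz iteration, which needs only matrix
    multiplications (no eigendecomposition or SVD). This is an assumed
    stand-in, not the update rule from the paper.
    """
    d = a.shape[0]
    identity = np.eye(d)
    # Pre-scale by the Frobenius norm so all eigenvalues lie in (0, 1],
    # which keeps the iteration inside its convergence region.
    norm = np.linalg.norm(a)
    y = a / norm           # converges to (a / norm)^(1/2)
    z = identity.copy()    # converges to (a / norm)^(-1/2)
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y = y @ t
        z = t @ z
    # Undo the pre-scaling: sqrt(a) = sqrt(norm) * sqrt(a / norm).
    return y * np.sqrt(norm)


def soft_threshold(a: np.ndarray, tau: float) -> np.ndarray:
    """Proximal operator of tau * ||a||_1: shrink every entry toward zero."""
    return np.sign(a) * np.maximum(np.abs(a) - tau, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    descriptors = rng.standard_normal((64, 8))   # 64 locations, 8 channels
    bilinear = bilinear_pool(descriptors)
    root = newton_schulz_sqrt(bilinear)
    sparse_root = soft_threshold(root, tau=0.05)
    print("reconstruction error:", np.abs(root @ root - bilinear).max())
    print("zeroed entries:", int((sparse_root == 0).sum()))
```

The sketch applies the two steps one after the other purely for illustration; the abstract describes solving the regularizers alternately through auxiliary variables, which this example does not attempt to reproduce.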


Updated: 2020-04-22
