Projection Based Weight Normalization: Efficient Method for Optimization on Oblique Manifold in DNNs
Pattern Recognition (IF 7.5) Pub Date: 2020-09-01, DOI: 10.1016/j.patcog.2020.107317
Lei Huang, Xianglong Liu, Jie Qin, Fan Zhu, Li Liu, Ling Shao

Abstract Optimizing deep neural networks (DNNs) often suffers from the ill-conditioned problem. We observe that the scaling-based weight space symmetry (SBWSS) in rectified nonlinear networks causes this negative effect. Therefore, we propose to constrain the incoming weights of each neuron to be unit-norm, which is formulated as an optimization problem over the Oblique manifold. A simple yet efficient method, referred to as projection based weight normalization (PBWN), is developed to solve this problem. The proposed method has a regularization property and collaborates well with the commonly used batch normalization technique. We conduct comprehensive experiments on several widely used image datasets, including CIFAR-10, CIFAR-100, SVHN, and ImageNet, for supervised learning with state-of-the-art neural networks. The experimental results show that our method consistently improves the performance of different architectures. We also apply our method to the Ladder network for semi-supervised learning on the permutation-invariant MNIST dataset, where it achieves state-of-the-art results: we obtain test errors of 2.52%, 1.06%, and 0.91% with only 20, 50, and 100 labeled samples, respectively.
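The core operation the abstract describes — constraining each neuron's incoming weight vector to unit norm, i.e., projecting the weight matrix back onto the Oblique manifold after each gradient update — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name and toy dimensions are assumptions for the example.

```python
import numpy as np

def project_to_oblique(W, eps=1e-12):
    """Project each row of W (one neuron's incoming weights)
    onto the unit sphere, so W lies on the Oblique manifold."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / np.maximum(norms, eps)  # eps guards against zero rows

# Toy example: one SGD step followed by the projection (PBWN-style).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))      # 4 neurons, 8 incoming weights each
grad = rng.normal(size=(4, 8))   # stand-in gradient
W = W - 0.1 * grad               # unconstrained gradient step
W = project_to_oblique(W)        # re-impose the unit-norm constraint
```

After the projection, every row of `W` has Euclidean norm 1, which is the constraint that neutralizes the scaling-based weight space symmetry discussed above.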

Updated: 2020-09-01