Sparse Deep Neural Networks Using L1,∞-Weight Normalization
Statistica Sinica (IF 1.4), Pub Date: 2021-01-01, DOI: 10.5705/ss.202018.0468
Ming Wen, Yixi Xu, Yunling Zheng, Zhouwang Yang, Xiao Wang

Deep neural networks (DNNs) have recently demonstrated remarkable performance on many challenging tasks, but overfitting remains one of their notorious shortcomings. Empirical evidence suggests that inducing sparsity can relieve overfitting and that weight normalization can accelerate algorithm convergence. In this paper, we study L1,∞-weight normalization for deep neural networks with bias neurons to achieve a sparse architecture. We theoretically establish generalization error bounds for both regression and classification under L1,∞-weight normalization. The upper bounds are independent of the network width and depend on the network depth k only through a √k factor; these are the best available bounds for networks with bias neurons. These results provide theoretical justification for using such weight normalization to reduce the generalization error. We also develop an easily implemented gradient projection descent algorithm to practically obtain a sparse neural network. We perform various experiments to validate our theory and demonstrate the effectiveness of the proposed approach.
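
As an illustration of the kind of projected-gradient update the abstract refers to, the sketch below projects a layer's weight matrix onto an L1,∞ ball after a gradient step. It assumes the common convention ||W||_{1,∞} = max_i ||W[i,:]||_1 (maximum row-wise L1 norm) and uses the standard sorting-based L1-ball projection; the radius c, learning rate lr, and gradient grad are illustrative placeholders, and this is a sketch of the general technique rather than the authors' exact algorithm.

```python
import numpy as np

def project_to_l1_ball(v, c):
    """Euclidean projection of vector v onto the L1 ball of radius c
    (sorting-based method of Duchi et al., 2008)."""
    if np.abs(v).sum() <= c:
        return v
    u = np.sort(np.abs(v))[::-1]              # magnitudes, descending
    cssv = np.cumsum(u)                        # cumulative sums of magnitudes
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > cssv - c)[0][-1]
    theta = (cssv[rho] - c) / (rho + 1.0)      # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1_inf(W, c):
    """Project W row by row so that max_i ||W[i, :]||_1 <= c,
    i.e. onto the L1,inf ball of radius c."""
    return np.stack([project_to_l1_ball(row, c) for row in W])

# One projected-gradient step on a single layer's weight matrix
# (W, grad, lr, and c are hypothetical placeholders):
# W = project_l1_inf(W - lr * grad, c)
```

Because the row-wise projection soft-thresholds the weights, entries below the threshold are set exactly to zero, which is one way such an L1,∞ constraint can yield a sparse network in practice.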
