Weight normalized deep neural networks
Stat (IF 0.7), Pub Date: 2020-12-08, DOI: 10.1002/sta4.344
Yixi Xu, Xiao Wang

The generalization error is the difference between the expected risk and the empirical risk of a learning algorithm. With high probability, this generalization error can be upper bounded by the Rademacher complexity of the underlying hypothesis class. This paper studies the function class of L_{p,q} weight normalized deep neural networks. We present a general framework for norm-based capacity control and a functional characterization of this class. In particular, we establish upper bounds on the Rademacher complexity of this family. With L_{1,∞} normalization, we obtain a width-independent capacity control that depends on the depth only through a square-root term. Furthermore, if the activation functions are anti-symmetric, the bound on the Rademacher complexity is independent of both the width and the depth up to a logarithmic factor.
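
For concreteness, the sketch below states a standard high-probability Rademacher-complexity generalization bound and one common definition of the L_{p,q} norm of a layer's weight matrix W. The symbols (sample size n, confidence level δ, hypothesis class F) are introduced here purely for illustration and need not match the paper's exact statements or constants.

\[
R(f) \;\le\; \widehat{R}_n(f) \;+\; 2\,\mathfrak{R}_n(\mathcal{F}) \;+\; \sqrt{\frac{\log(1/\delta)}{2n}}
\qquad \text{for all } f \in \mathcal{F},\ \text{with probability at least } 1-\delta,
\]
\[
\|W\|_{p,q} \;=\; \Bigl(\sum_{j}\Bigl(\sum_{i}|W_{ij}|^{p}\Bigr)^{q/p}\Bigr)^{1/q},
\qquad
\|W\|_{1,\infty} \;=\; \max_{j}\sum_{i}|W_{ij}|.
\]

Under this convention, an L_{p,q} weight normalized network constrains (or rescales) each layer's weight matrix so that its L_{p,q} norm is bounded by a fixed constant; note that the row/column convention for the inner and outer sums can differ between papers.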

Updated: 2021-02-04