Weight normalized deep neural networks
Stat (IF 0.7), Pub Date: 2020-12-08, DOI: 10.1002/sta4.344
Yixi Xu, Xiao Wang

The generalization error is the difference between the expected risk and the empirical risk of a learning algorithm. With high probability, this generalization error can be upper bounded by the Rademacher complexity of the underlying hypothesis class. This paper studies the function class of L_{p,q} weight normalized deep neural networks. We present a general framework for norm-based capacity control and a functional characterization of this class. In particular, we establish upper bounds on the Rademacher complexity of this family. With L_{1,∞} normalization, we obtain a width-independent capacity control that depends on the depth only through a square-root term. Furthermore, if the activation functions are anti-symmetric, the bound on the Rademacher complexity is independent of both the width and the depth up to a logarithmic factor.
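
For concreteness, the sketch below states a standard high-probability Rademacher-complexity generalization bound and one common definition of the L_{p,q} norm of a layer's weight matrix W. The symbols (sample size n, confidence level δ, hypothesis class F) are introduced here purely for illustration and need not match the paper's exact statements or constants.

\[
R(f) \;\le\; \widehat{R}_n(f) \;+\; 2\,\mathfrak{R}_n(\mathcal{F}) \;+\; \sqrt{\frac{\log(1/\delta)}{2n}}
\qquad \text{for all } f \in \mathcal{F},\ \text{with probability at least } 1-\delta,
\]
\[
\|W\|_{p,q} \;=\; \Bigl(\sum_{j}\Bigl(\sum_{i}|W_{ij}|^{p}\Bigr)^{q/p}\Bigr)^{1/q},
\qquad
\|W\|_{1,\infty} \;=\; \max_{j}\sum_{i}|W_{ij}|.
\]

Under this convention, an L_{p,q} weight normalized network constrains (or rescales) each layer's weight matrix so that its L_{p,q} norm is bounded by a fixed constant; note that the row/column convention for the inner and outer sums can differ between papers.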

Updated: 2021-02-04