Symmetry & critical points for a model shallow neural network
Physica D: Nonlinear Phenomena ( IF 4 ) Pub Date : 2021-08-26 , DOI: 10.1016/j.physd.2021.133014
Yossi Arjevani 1 , Michael Field 2

Using methods based on the analysis of real analytic functions, symmetry, and equivariant bifurcation theory, we obtain sharp results on families of critical points of spurious minima that occur in optimization problems associated with fitting two-layer ReLU networks with k hidden neurons. The main mathematical result is a power series representation of families of critical points of spurious minima in terms of 1/k (with coefficients independent of k). We also give a path-based formulation that naturally connects these critical points with the critical points of an associated linear, but highly singular, optimization problem. The critical points of the associated problem closely approximate those of the original problem.

The mathematical theory is used to derive results on the original problem in neural nets. For example, we obtain precise estimates for several quantities, and these show that not all spurious minima are alike. In particular, we show that while the loss function at certain types of spurious minima decays to zero like k^{-1}, in other cases the loss converges to a strictly positive constant.
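
The abstract does not write out the loss function. As an illustration only, the sketch below assumes a common student-teacher setup: squared loss, standard Gaussian inputs, and target weights equal to the identity matrix. None of these choices is stated in the abstract, and the function names are hypothetical. Under those assumptions the expected loss has a closed form, because E[relu(w.x) relu(v.x)] for x ~ N(0, I) is given by the well-known arc-cosine kernel formula.

```python
import math

def relu_pair_expectation(w, v):
    """E[relu(w.x) * relu(v.x)] for x ~ N(0, I) (arc-cosine kernel formula)."""
    nw = math.sqrt(sum(a * a for a in w))
    nv = math.sqrt(sum(a * a for a in v))
    if nw == 0.0 or nv == 0.0:
        return 0.0
    cos_t = sum(a * b for a, b in zip(w, v)) / (nw * nv)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding outside [-1, 1]
    theta = math.acos(cos_t)
    return nw * nv * (math.sin(theta) + (math.pi - theta) * cos_t) / (2.0 * math.pi)

def population_loss(W, V):
    """0.5 * E[(sum_i relu(w_i.x) - sum_j relu(v_j.x))^2] for x ~ N(0, I).

    W and V are lists of weight vectors (rows). This form of the loss is an
    assumption made for illustration, not taken from the abstract.
    """
    ww = sum(relu_pair_expectation(a, b) for a in W for b in W)
    wv = sum(relu_pair_expectation(a, b) for a in W for b in V)
    vv = sum(relu_pair_expectation(a, b) for a in V for b in V)
    return 0.5 * ww - wv + 0.5 * vv

def identity(k):
    """k x k identity matrix as a list of rows (the assumed target weights)."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

if __name__ == "__main__":
    k = 4
    V = identity(k)
    W = [[0.9 if i == j else 0.05 for j in range(k)] for i in range(k)]
    print(population_loss(V, V))  # loss at the target weights
    print(population_loss(W, V))  # loss at a nearby point
```

With the identity as target, this loss is unchanged when the rows of W are permuted, which is the kind of symmetry the abstract's methods exploit.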




Updated: 2021-09-06