Topological Properties of the Set of Functions Generated by Neural Networks of Fixed Size
Foundations of Computational Mathematics (IF 3) Pub Date: 2020-05-14, DOI: 10.1007/s10208-020-09461-0
Philipp Petersen, Mones Raslan, Felix Voigtlaender

We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties. It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to \(L^p\)-norms, \(0< p < \infty \), for all practically used activation functions, and also not closed with respect to the \(L^\infty \)-norm for all practically used activation functions except for the ReLU and the parametric ReLU. Finally, the function that maps a family of weights to the function computed by the associated network is not inverse stable for every practically used activation function. In other words, if \(f_1, f_2\) are two functions realized by neural networks and if \(f_1, f_2\) are close in the sense that \(\Vert f_1 - f_2\Vert _{L^\infty } \le \varepsilon \) for \(\varepsilon > 0\), it is, regardless of the size of \(\varepsilon \), usually not possible to find weights \(w_1, w_2\) close together such that each \(f_i\) is realized by a neural network with weights \(w_i\). Overall, our findings identify potential causes for issues in the training procedure of deep learning such as no guaranteed convergence, explosion of parameters, and slow convergence.
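The non-closedness claim can be illustrated with a standard one-neuron example (a minimal numpy sketch, not a construction from the paper): with the sigmoid activation \(\sigma\), each function \(x \mapsto \sigma(nx)\) is realized by a single neuron, yet on \([-1,1]\) the sequence converges in \(L^p\), \(0< p < \infty \), to the discontinuous Heaviside step function, which no fixed-size network with a continuous activation can realize; the \(L^\infty \) distance, by contrast, stays at \(1/2\).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One-neuron sigmoid networks f_n(x) = sigmoid(n * x) on [-1, 1],
# compared with the discontinuous Heaviside step function H.
x = np.linspace(-1.0, 1.0, 200_001)
dx = x[1] - x[0]
H = (x > 0).astype(float)

p = 2  # any 0 < p < infinity exhibits the same behavior
for n in [1, 10, 100, 1000]:
    f_n = sigmoid(n * x)
    lp_dist = (np.sum(np.abs(f_n - H) ** p) * dx) ** (1.0 / p)  # tends to 0 as n grows
    sup_dist = np.max(np.abs(f_n - H))                          # stays near 1/2
    print(f"n = {n:5d}   L^{p} distance ~ {lp_dist:.4f}   L^inf distance ~ {sup_dist:.4f}")
```

The limit H thus lies in the \(L^p\)-closure of the set of one-neuron sigmoid networks but not in the set itself, while the constant \(L^\infty \) gap of \(1/2\) shows that this particular sequence does not witness a failure of \(L^\infty \)-closedness.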



Updated: 2020-05-14