Can stable and accurate neural networks be computed? -- On the barriers of deep learning and Smale's 18th problem
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-01-20, DOI: arxiv-2101.08286
Vegard Antun, Matthew J. Colbrook, Anders C. Hansen

Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, DL suffers from a universal phenomenon: instability, despite universal approximation properties that often guarantee the existence of stable neural networks (NNs). We show the following paradox. There are basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities; however, there does not exist any algorithm, even randomised, that can train (or compute) such a NN. Indeed, for any positive integers $K > 2$ and $L$, there are cases where simultaneously: (a) no randomised algorithm can compute a NN correct to $K$ digits with probability greater than $1/2$, (b) there exists a deterministic algorithm that computes a NN with $K-1$ correct digits, but any such (even randomised) algorithm needs arbitrarily many training data, (c) there exists a deterministic algorithm that computes a NN with $K-2$ correct digits using no more than $L$ training samples. These results provide basic foundations for Smale's 18th problem and imply a potentially vast, and crucial, classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by initiating a unified theory for compressed sensing and DL, leading to sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce Fast Iterative REstarted NETworks (FIRENETs), which we prove and numerically verify are stable. Moreover, we prove that only $\mathcal{O}(|\log(\epsilon)|)$ layers are needed for an $\epsilon$-accurate solution to the inverse problem (exponential convergence), and that the inner dimensions in the layers do not exceed the dimension of the inverse problem. Thus, FIRENETs are computationally very efficient.
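For readers unfamiliar with unrolled iterative schemes, the sketch below illustrates the general idea of restarted, unrolled iterations for a sparse-recovery inverse problem: each inner iteration plays the role of one network layer, so the depth needed for accuracy $\epsilon$ grows like $|\log(\epsilon)|$ under suitable conditions. This is a minimal NumPy illustration on assumed synthetic data (random Gaussian $A$, sparse $x$); the function names and parameters are hypothetical, and it is not the FIRENET construction from the paper.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise soft thresholding: the proximal map of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def unrolled_restarted_ista(A, y, lam=1e-2, n_restarts=8, inner_iters=25, decay=0.5):
    """Toy unrolled, restarted ISTA for min_x 0.5*||Ax - y||^2 + lam*||x||_1.

    Each inner iteration corresponds to one 'layer' of an unrolled network,
    so the total depth is n_restarts * inner_iters. The regularisation
    parameter is reduced at every restart. Illustrative sketch only, not
    the FIRENET construction of the paper.
    """
    n = A.shape[1]
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(n)
    for k in range(n_restarts):
        lam_k = lam * decay ** k             # shrink the threshold after each restart
        for _ in range(inner_iters):
            grad = A.T @ (A @ x - y)         # gradient of the data-fidelity term
            x = soft_threshold(x - step * grad, step * lam_k)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, s = 40, 100, 5                     # measurements, ambient dimension, sparsity
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x_true = np.zeros(n)
    x_true[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
    y = A @ x_true
    x_hat = unrolled_restarted_ista(A, y)
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```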

Updated: 2021-01-22