Depth-Adaptive Neural Networks from the Optimal Control viewpoint
arXiv - CS - Numerical Analysis. Pub Date: 2020-07-05. arXiv:2007.02428
Joubine Aghili and Olga Mula

In recent years, deep learning has been connected with optimal control as a way to define a continuous underlying learning problem. In this view, neural networks can be interpreted as a discretization of a parametric ordinary differential equation (ODE) which, in the limit, defines a continuous-depth neural network. The learning task then consists in finding the best ODE parameters for the problem under consideration, and their number grows with the accuracy of the time discretization. Although important steps have been taken to realize the advantages of such continuous formulations, most current learning techniques fix the discretization in advance (i.e., the number of layers is fixed). In this work, we propose an iterative adaptive algorithm in which the time discretization is progressively refined (i.e., the number of layers is increased). Provided that certain tolerances are met across the iterations, we prove that the strategy converges to the underlying continuous problem. One salient advantage of such a shallow-to-deep approach is that it helps, in practice, to benefit from the higher approximation properties of deep networks while mitigating over-parametrization issues. The performance of the approach is illustrated on several numerical examples.
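To make the ODE interpretation concrete, here is a minimal Python sketch of how a residual forward pass coincides with an explicit-Euler discretization of x'(t) = f(x(t), theta(t)). The tanh vector field, the Euler scheme, and all names below are illustrative assumptions, not the paper's specific choices.

```python
import numpy as np

def f(x, theta):
    """Illustrative parametric vector field f(x, theta) = tanh(W x + b).

    theta = (W, b) plays the role of the time-dependent ODE parameters;
    this particular form is an assumption, not the paper's choice.
    """
    W, b = theta
    return np.tanh(W @ x + b)

def forward(x0, thetas, T=1.0):
    """Explicit-Euler discretization of x'(t) = f(x(t), theta(t)) on [0, T].

    Each Euler step x <- x + dt * f(x, theta_k) is one residual layer,
    so len(thetas) is the network depth, and dt -> 0 recovers the
    continuous-depth limit described in the abstract.
    """
    dt = T / len(thetas)
    x = x0
    for theta in thetas:
        x = x + dt * f(x, theta)
    return x

# A 4-layer network acting on a 2-dimensional state (random illustrative weights).
rng = np.random.default_rng(0)
thetas = [(rng.normal(size=(2, 2)), rng.normal(size=2)) for _ in range(4)]
print(forward(np.ones(2), thetas))
```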
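The shallow-to-deep loop itself can be sketched as follows. The piecewise-constant prolongation, the hypothetical `train` and `loss` callables, and the stopping rule are placeholders standing in for the paper's actual refinement and tolerance criteria.

```python
def refine(thetas):
    """Halve the time step by reusing each theta_k on both half-intervals
    (piecewise-constant prolongation; an assumed refinement rule)."""
    doubled = []
    for theta in thetas:
        doubled.extend([theta, theta])
    return doubled

def adaptive_training(thetas, train, loss, tol=1e-3, max_rounds=5):
    """Shallow-to-deep iteration: optimize at the current depth, then
    refine the time grid (add layers) and warm-start from the prolonged
    parameters, stopping once the loss improvement falls below tol.

    `train` and `loss` are hypothetical callables: `train` optimizes the
    parameters at fixed depth, `loss` evaluates the learning objective.
    """
    prev = float("inf")
    for _ in range(max_rounds):
        thetas = train(thetas)      # optimize at the current depth
        current = loss(thetas)
        if prev - current < tol:    # placeholder stopping criterion
            break
        prev = current
        thetas = refine(thetas)     # double the number of layers
    return thetas
```

Warm-starting each deeper network from the prolonged shallow parameters is, per the abstract, what lets the approach exploit depth while mitigating over-parametrization, rather than training a deep model from scratch.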

Updated: 2020-07-07