Elimination of All Bad Local Minima in Deep Learning
Kenji Kawaguchi, Leslie Pack Kaelbling
arXiv - CS - Neural and Evolutionary Computing. Pub Date: 2019-01-02. DOI: arxiv-1901.00279
In this paper, we theoretically prove that adding one special neuron per
output unit eliminates all suboptimal local minima of any deep neural network,
for multi-class classification, binary classification, and regression with an
arbitrary loss function, under practical assumptions. At every local minimum of
any deep neural network with these added neurons, the set of parameters of the
original neural network (without added neurons) is guaranteed to be a global
minimum of the original neural network. The effects of the added neurons are
proven to automatically vanish at every local minimum. Moreover, we provide a
novel theoretical characterization of a failure mode of eliminating suboptimal
local minima via an additional theorem and several examples. This paper also
introduces a novel proof technique based on the perturbable gradient basis
(PGB) necessary condition for local minima, which provides new insight into the
elimination of local minima and can be applied to analyze various models and
transformations of objective functions beyond the elimination of local minima.
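
The abstract does not state what form the added neurons take. As an illustration only, the sketch below assumes one exponential neuron per output unit, a_k * exp(w_k . x + b_k), together with an L2 penalty on the coefficients a. These choices, the toy network, and every function and parameter name are assumptions made for this example and are not taken from the text above. The sketch only shows what it means for the effect of the added neurons to vanish: when a = 0, the augmented loss coincides with the original loss.

```python
import numpy as np

def original_net(x, W1, W2):
    """Tiny one-hidden-layer ReLU network standing in for the original model."""
    return np.maximum(x @ W1, 0.0) @ W2

def added_neurons(x, a, w, b):
    """Hypothetical added neurons, one per output unit: a_k * exp(w_k . x + b_k)."""
    return a * np.exp(x @ w + b)

def original_loss(x, y, W1, W2):
    """Mean squared error of the original network."""
    return np.mean((original_net(x, W1, W2) - y) ** 2)

def augmented_loss(x, y, W1, W2, a, w, b, lam=1e-2):
    """Mean squared error of the augmented network plus an assumed L2 penalty on a."""
    pred = original_net(x, W1, W2) + added_neurons(x, a, w, b)
    return np.mean((pred - y) ** 2) + lam * np.sum(a ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.normal(size=(5, 3)), rng.normal(size=(5, 2))
    W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 2))
    w, b = rng.normal(size=(3, 2)), rng.normal(size=2)
    # With a = 0 the added neurons contribute nothing and the penalty is zero.
    print(np.isclose(augmented_loss(x, y, W1, W2, np.zeros(2), w, b),
                     original_loss(x, y, W1, W2)))
```

The theorem in the abstract concerns local minima of the augmented objective. This sketch does not verify that claim; it only makes the objects involved concrete.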
Updated: 2020-01-17