Elimination of All Bad Local Minima in Deep Learning, arXiv - CS - Neural and Evolutionary Computing

Elimination of All Bad Local Minima in Deep Learning
arXiv - CS - Neural and Evolutionary Computing, Pub Date: 2019-01-02, DOI: arxiv-1901.00279
Kenji Kawaguchi, Leslie Pack Kaelbling

In this paper, we theoretically prove that adding one special neuron per output unit eliminates all suboptimal local minima of any deep neural network, for multi-class classification, binary classification, and regression with an arbitrary loss function, under practical assumptions. At every local minimum of any deep neural network with these added neurons, the set of parameters of the original neural network (without added neurons) is guaranteed to be a global minimum of the original neural network. The effects of the added neurons are proven to automatically vanish at every local minimum. Moreover, we provide a novel theoretical characterization of a failure mode of eliminating suboptimal local minima via an additional theorem and several examples. This paper also introduces a novel proof technique based on the perturbable gradient basis (PGB) necessary condition of local minima, which provides new insight into the elimination of local minima and can be applied to analyze various models and transformations of objective functions beyond the elimination of local minima.
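
The abstract does not spell out what the added neurons look like. As a rough illustration only, the sketch below assumes each output unit k receives an extra term a_k * exp(w_k . x + b_k) and that the training objective gains a penalty lam * ||a||^2 on the added output weights. That functional form, the penalty, and every name here (added_neuron, modified_output, modified_loss, lam) are assumptions made for illustration, not details given in this listing. The original network is passed in as an arbitrary callable f.

```python
import math

def added_neuron(x, a_k, b_k, w_k):
    """Assumed form of one added neuron: a_k * exp(w_k . x + b_k)."""
    return a_k * math.exp(sum(w * xi for w, xi in zip(w_k, x)) + b_k)

def modified_output(f_x, x, a, b, W):
    """Add one neuron's output to each output unit of the original network.

    f_x holds the original outputs f(x), one entry per output unit.
    """
    return [f_k + added_neuron(x, a_k, b_k, w_k)
            for f_k, a_k, b_k, w_k in zip(f_x, a, b, W)]

def modified_loss(loss, data, f, a, b, W, lam=1.0):
    """Mean loss of the modified network plus the assumed penalty lam * ||a||^2."""
    fit = sum(loss(modified_output(f(x), x, a, b, W), y) for x, y in data) / len(data)
    return fit + lam * sum(a_k ** 2 for a_k in a)

# Hypothetical stand-ins for the original network, the loss, and the data.
f = lambda x: [x[0] - x[1], 2.0 * x[0]]
squared = lambda pred, y: sum((p - t) ** 2 for p, t in zip(pred, y))
data = [([1.0, 2.0], [0.0, 1.0]), ([0.5, -1.0], [1.0, 1.0])]
b = [0.0, 0.0]
W = [[0.3, -0.2], [0.1, 0.4]]

original = sum(squared(f(x), y) for x, y in data) / len(data)
with_zero_a = modified_loss(squared, data, f, [0.0, 0.0], b, W)
```

When every a_k is zero, the added terms and the penalty both disappear, so with_zero_a matches the original loss. Under the assumptions above, this is the sense in which the effect of the added neurons can vanish.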

Updated: 2020-01-17
