A geometric approach of gradient descent algorithms in linear neural networks
Mathematical Control and Related Fields (IF 1.0) Pub Date: 2022-01-01, DOI: 10.3934/mcrf.2022021
Yacine Chitour, Zhenyu Liao, Romain Couillet

In this paper, we propose a geometric framework to analyze the convergence properties of gradient descent trajectories in the context of linear neural networks. We translate a well-known empirical observation about linear neural nets into a conjecture, which we call the overfitting conjecture, stating that, for almost all training data and initial conditions, the trajectory of the corresponding gradient descent system converges to a global minimum. This would imply that the solution found by vanilla gradient descent algorithms is equivalent to the least-squares estimate, for linear neural networks with an arbitrary number of hidden layers. Building on a key invariance property induced by the network structure, we first establish convergence of gradient descent trajectories to critical points of the square loss function for linear networks of arbitrary depth. Our second result is a proof of the overfitting conjecture for single-hidden-layer linear networks, with an argument based on the notion of normal hyperbolicity and under a generic property of the training data (i.e., one holding for almost all training data).
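As a rough numerical illustration of the single-hidden-layer case described above (a minimal sketch, not the paper's construction or proof): the script below trains a linear network f(x) = x W1 W2 with plain gradient descent on the square loss and compares the learned product W1 W2 with the closed-form least-squares estimator. All problem sizes, the step size, and the iteration count are illustrative assumptions.

```python
# Minimal sketch (not from the paper): for a single-hidden-layer linear network
# trained by plain gradient descent on the square loss, the end-to-end map
# W1 @ W2 is expected, for generic data and initialization, to approach the
# least-squares estimator. Dimensions, step size, and iteration count are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n, d_in, d_hid, d_out = 200, 5, 6, 3           # hypothetical problem sizes
X = rng.standard_normal((n, d_in))              # training inputs
B_true = rng.standard_normal((d_in, d_out))     # ground-truth linear map
Y = X @ B_true + 0.1 * rng.standard_normal((n, d_out))  # noisy targets

# Closed-form least-squares estimator, for comparison
B_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Single-hidden-layer linear network f(x) = x @ W1 @ W2
W1 = 0.1 * rng.standard_normal((d_in, d_hid))
W2 = 0.1 * rng.standard_normal((d_hid, d_out))

lr = 1e-3
for _ in range(20_000):
    R = X @ W1 @ W2 - Y          # residuals
    g1 = X.T @ R @ W2.T / n      # gradient of L = ||X W1 W2 - Y||_F^2 / (2n) w.r.t. W1
    g2 = W1.T @ X.T @ R / n      # gradient w.r.t. W2
    W1 -= lr * g1
    W2 -= lr * g2

# If the trajectory reaches a global minimum, this distance should be small,
# since with d_hid >= min(d_in, d_out) the product W1 W2 can realize B_ls.
print("||W1 W2 - B_ls||_F =", np.linalg.norm(W1 @ W2 - B_ls))
```

Note that d_hid is chosen at least min(d_in, d_out) so that the global minimum of the network loss coincides with the least-squares loss; this is only a numerical check of the statement, not a substitute for the normal-hyperbolicity argument in the paper.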

Updated: 2022-01-01