Modeling the influence of data structure on learning in neural networks: The Hidden Manifold Model
Physical Review X (IF 12.5) Pub Date:
Sebastian Goldt, Marc Mézard, Florent Krzakala, Lenka Zdeborová

Understanding the reasons for the success of deep neural networks trained using stochastic gradient-based methods is a key open problem for the nascent theory of deep learning. The types of data where these networks are most successful, such as images or sequences of speech, are characterised by intricate correlations. Yet, most theoretical work on neural networks does not explicitly model training data, or assumes that elements of each data sample are drawn independently from some factorised probability distribution. These approaches are thus by construction blind to the correlation structure of real-world data sets and its impact on learning in neural networks. Here, we introduce a generative model for structured data sets that we call the hidden manifold model (HMM). The idea is to construct high-dimensional inputs that lie on a lower-dimensional manifold, with labels that depend only on their position within this manifold, akin to a single-layer decoder or generator in a generative adversarial network. We demonstrate that learning of the hidden manifold model is amenable to an analytical treatment by proving a "Gaussian Equivalence Property" (GEP), and we use the GEP to show how the dynamics of two-layer neural networks trained using one-pass stochastic gradient descent are captured by a set of integro-differential equations that track the performance of the network at all times. This permits us to analyse in detail how a neural network learns functions of increasing complexity during training, how its performance depends on its size, and how it is impacted by parameters such as the learning rate or the dimension of the hidden manifold.
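To make the setup concrete, here is a minimal NumPy sketch of the hidden manifold model and of one-pass SGD on a two-layer network. All specifics below (the dimensions, the tanh nonlinearities, the random projection `F`, the latent teacher `w_star`, and the learning rate) are illustrative assumptions for this sketch, not the paper's exact generator or training protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions for this sketch):
N = 500        # ambient input dimension
D = 10         # dimension of the hidden manifold, D << N
K = 4          # hidden units in the two-layer student network
lr = 0.2       # SGD learning rate
steps = 20000  # one-pass SGD: every sample is seen exactly once

# A fixed random projection F and an elementwise nonlinearity fold the
# D-dimensional latent space into the N-dimensional input space.
F = rng.standard_normal((D, N))
w_star = rng.standard_normal(D)  # hypothetical latent "teacher" weights

def sample():
    """Draw one (input, label) pair from the hidden manifold model."""
    z = rng.standard_normal(D)            # position on the manifold
    x = np.tanh(z @ F / np.sqrt(D))       # structured high-dimensional input
    y = np.tanh(z @ w_star / np.sqrt(D))  # label depends only on z
    return x, y

# Two-layer student: y_hat = v . tanh(W x / sqrt(N))
W = rng.standard_normal((K, N))
v = rng.standard_normal(K) / np.sqrt(K)

for _ in range(steps):
    x, y = sample()
    h = W @ x / np.sqrt(N)
    g = np.tanh(h)
    err = g @ v - y  # residual of the squared loss
    grad_v = err * g
    grad_W = err * np.outer(v * (1.0 - g**2), x) / np.sqrt(N)
    v -= lr * grad_v
    W -= lr * grad_W

# Generalisation error estimated on fresh samples from the same manifold.
test = [sample() for _ in range(2000)]
mse = np.mean([(np.tanh(W @ x / np.sqrt(N)) @ v - y) ** 2 for x, y in test])
print(f"test MSE after {steps} one-pass steps: {mse:.4f}")
```

In the paper's analysis, the dimensions are taken large, and the test error of this kind of online dynamics is then tracked at all times by the integro-differential equations mentioned in the abstract; the sketch above only simulates one finite-size instance.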

Updated: 2020-09-08