On random matrices arising in deep neural networks: General I.I.D. case
Random Matrices: Theory and Applications (IF 0.9). Pub Date: 2022-07-14. DOI: 10.1142/s2010326322500460
Leonid Pastur, Victor Slavin

We study the eigenvalue distribution of random matrices pertinent to the analysis of deep neural networks. The matrices resemble products of sample covariance matrices; an important difference, however, is that the analog of the population covariance matrix is now a function of the random data matrices (synaptic weight matrices, in deep neural network terminology). The problem was treated in recent work [J. Pennington, S. Schoenholz and S. Ganguli, The emergence of spectral universality in deep networks, Proc. Mach. Learn. Res. 84 (2018) 1924–1932, arXiv:1802.09979] using the techniques of free probability theory. Since free probability theory deals with population covariance matrices that are independent of the data matrices, however, its applicability in this case has to be justified. The justification was given in [L. Pastur, On random matrices arising in deep neural networks: Gaussian case, Pure Appl. Funct. Anal. (2020), in press, arXiv:2001.06188] for Gaussian data matrices with independent entries, a standard analytical model of free probability, using a version of the techniques of random matrix theory. In this paper, we use another version of these techniques to extend the results of that work to the case where the entries of the data matrices are merely independent identically distributed random variables with zero mean and finite fourth moment. In particular, this justifies the mean-field approximation in the infinite-width limit for deep untrained neural networks, as well as the property of macroscopic universality of random matrix theory in this case.
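The macroscopic universality claimed above can be illustrated numerically: the empirical eigenvalue distribution of a product-type matrix built from i.i.d. entries should be asymptotically insensitive to the entry distribution, provided the mean is zero and the variance is normalized. The sketch below is a simplified toy version of the paper's setting (a plain L-fold product of weight matrices, without the nonlinearity and population-covariance structure of the actual model); the function names, dimensions, and depth are illustrative choices, not from the paper.

```python
import numpy as np

def product_spectrum(entry_sampler, n=400, depth=3, seed=0):
    """Sorted eigenvalues of M = Y Y^T, where Y is a product of
    `depth` independent n x n matrices whose entries are i.i.d.
    with zero mean and variance 1/n (toy model, no nonlinearity)."""
    rng = np.random.default_rng(seed)
    Y = np.eye(n)
    for _ in range(depth):
        # entry_sampler returns entries of zero mean and unit variance;
        # dividing by sqrt(n) gives the 1/n variance normalization.
        W = entry_sampler(rng, (n, n)) / np.sqrt(n)
        Y = W @ Y
    M = Y @ Y.T  # symmetric positive semi-definite
    return np.sort(np.linalg.eigvalsh(M))

# Gaussian entries vs. Rademacher (+/-1) entries: same mean, same variance.
gauss = product_spectrum(lambda rng, shape: rng.standard_normal(shape))
rademacher = product_spectrum(lambda rng, shape: rng.choice([-1.0, 1.0], size=shape))

# Macroscopic universality: bulk statistics of the two spectra should
# agree up to finite-size fluctuations.
print("medians:", np.median(gauss), np.median(rademacher))
print("means:  ", np.mean(gauss), np.mean(rademacher))
```

The normalized trace of M concentrates near 1 in both cases, and the bulk quantiles of the two empirical spectra agree increasingly well as n grows, which is exactly the kind of entry-distribution independence the paper establishes rigorously under the finite-fourth-moment assumption.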



