LU decomposition and Toeplitz decomposition of a neural network
Applied and Computational Harmonic Analysis (IF 2.5), Pub Date: 2023-10-06, DOI: 10.1016/j.acha.2023.101601
Yucong Liu, Simiao Jiao, Lek-Heng Lim

Any matrix $A$ has an LU decomposition up to a row or column permutation. Less well known is the fact that it has a 'Toeplitz decomposition' $A = T_1 T_2 \cdots T_r$, where the $T_i$'s are Toeplitz matrices. We will prove that any continuous function $f\colon \mathbb{R}^n \to \mathbb{R}^m$ has an approximation to arbitrary accuracy by a neural network that maps $x \in \mathbb{R}^n$ to $L_1 \sigma_1 U_1 \sigma_2 L_2 \sigma_3 U_2 \cdots L_r \sigma_{2r-1} U_r x \in \mathbb{R}^m$, i.e., where the weight matrices alternate between lower and upper triangular matrices, $\sigma_i(x) := \sigma(x - b_i)$ for some bias vector $b_i$, and the activation $\sigma$ may be chosen to be essentially any uniformly continuous nonpolynomial function. The same result also holds with Toeplitz matrices, i.e., $f \approx T_1 \sigma_1 T_2 \sigma_2 \cdots \sigma_{r-1} T_r$ to arbitrary accuracy, and likewise for Hankel matrices. A consequence of our Toeplitz result is a fixed-width universal approximation theorem for convolutional neural networks, which so far have only arbitrary-width versions. Since our results apply in particular to the case when $f$ is a general neural network, we may regard them as LU and Toeplitz decompositions of a neural network. The practical implication of our results is that one may vastly reduce the number of weight parameters in a neural network without sacrificing its power of universal approximation. We will present several experiments on real data sets to show that imposing such structures on the weight matrices dramatically reduces the number of training parameters with almost no noticeable effect on test accuracy.
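For concreteness, here is a minimal PyTorch sketch (ours, not the authors' code) of the two structured layers the abstract describes: a triangular affine layer for the LU-style network $L_1 \sigma_1 U_1 \cdots \sigma_{2r-1} U_r x$, and a Toeplitz affine layer. The class names TriangularLayer and ToeplitzLayer, the helper lu_network, and all hyperparameters are illustrative assumptions; the masking trick below is one straightforward way to constrain the weights, not necessarily the parameterization used in the paper's experiments.

```python
import torch
import torch.nn as nn


class TriangularLayer(nn.Module):
    """Affine layer x -> Wx + b with W masked to be lower (or upper)
    triangular, so only n(n+1)/2 of the n^2 entries are effective."""

    def __init__(self, dim: int, lower: bool = True):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(dim, dim) / dim**0.5)
        self.bias = nn.Parameter(torch.zeros(dim))
        ones = torch.ones(dim, dim)
        self.register_buffer("mask", torch.tril(ones) if lower else torch.triu(ones))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ (self.weight * self.mask).T + self.bias


class ToeplitzLayer(nn.Module):
    """Affine layer whose weight matrix T is Toeplitz: T[i, j] depends only
    on i - j, giving 2n - 1 free parameters instead of n^2."""

    def __init__(self, dim: int):
        super().__init__()
        self.coeffs = nn.Parameter(torch.randn(2 * dim - 1) / dim**0.5)
        self.bias = nn.Parameter(torch.zeros(dim))
        idx = torch.arange(dim)
        # Entry (i, j) of the index map is i - j, shifted to be nonnegative.
        self.register_buffer("index", idx.unsqueeze(1) - idx.unsqueeze(0) + dim - 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        T = self.coeffs[self.index]  # materialize the Toeplitz matrix
        return x @ T.T + self.bias


def lu_network(dim: int, r: int, act=nn.ReLU) -> nn.Sequential:
    """Fixed-width network computing L_1 sigma_1 U_1 ... sigma_{2r-1} U_r x.
    U_r is applied first, so the upper layer comes first in the Sequential."""
    layers = []
    for _ in range(r):
        layers += [TriangularLayer(dim, lower=False), act(),
                   TriangularLayer(dim, lower=True), act()]
    return nn.Sequential(*layers[:-1])  # no activation after the final L_1


if __name__ == "__main__":
    net = lu_network(dim=64, r=3)     # 2r = 6 triangular weight matrices
    out = net(torch.randn(8, 64))     # batch of 8 inputs in R^64
    print(out.shape)                  # torch.Size([8, 64])
```

The parameter counts make the abstract's point concrete: a dense n-by-n layer has n^2 weights, the triangular layer n(n+1)/2, and the Toeplitz layer only 2n - 1.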



Updated: 2023-10-06