Associated Learning: Decomposing End-to-End Backpropagation Based on Autoencoders and Target Propagation
Neural Computation (IF 2.9), Pub Date: 2021-01-01, DOI: 10.1162/neco_a_01335
Yu-Wei Kao, Hung-Hsuan Chen

Backpropagation (BP) is the cornerstone of today's deep learning algorithms, but it is inefficient partly because of backward locking: updating the weights of one layer locks the weight updates in the other layers. Consequently, it is challenging to apply parallel computing or a pipeline structure to update the weights in different layers simultaneously. In this letter, we introduce a novel learning structure, associated learning (AL), which modularizes the network into smaller components, each with a local objective. Because the objectives are mutually independent, AL can learn the parameters in different layers independently and simultaneously, so it is feasible to apply a pipeline structure to improve the training throughput. Specifically, this pipeline structure reduces the training time complexity from O(nℓ), the complexity when training with BP and stochastic gradient descent (SGD), to O(n+ℓ), where n is the number of training instances and ℓ is the number of hidden layers. Surprisingly, even though most of the parameters in AL do not directly interact with the target variable, training deep models by this method yields accuracies comparable to those of models trained using typical BP methods, in which all parameters are used to predict the target variable. Consequently, given the scalability and predictive power demonstrated in the experiments, AL deserves further study to determine good hyperparameter settings, such as the choice of activation function, learning rate schedule, and weight initialization, and to accumulate experience, as has been done over the years with the typical BP method. In addition, perhaps our design can also inspire new network designs for deep learning. Our implementation is available at https://github.com/SamYWK/Associated_Learning.
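To make the idea of layer-local objectives concrete, the sketch below (in PyTorch) trains a stack of blocks in which each block bridges a detached copy of the forward features and an encoding of the target, so no gradient crosses block boundaries. The class name LocalBlock, the single MSE association loss, and all layer sizes are illustrative assumptions for this sketch only; the paper's actual AL components combine autoencoder and target-propagation objectives, as described in the full text.

import torch
import torch.nn as nn

class LocalBlock(nn.Module):
    """One component with a purely local objective (illustrative, not the paper's exact design)."""
    def __init__(self, in_dim, hid_dim, target_dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())      # forward path for features
        self.g = nn.Sequential(nn.Linear(target_dim, hid_dim), nn.ReLU())  # encoder for the target signal
        self.opt = torch.optim.SGD(self.parameters(), lr=1e-2)             # each block owns its optimizer

    def local_step(self, x, t):
        """Update this block's parameters using only its local loss."""
        h = self.f(x)
        t_enc = self.g(t)
        loss = nn.functional.mse_loss(h, t_enc)   # associate the feature with the encoded target
        self.opt.zero_grad()
        loss.backward()                            # gradients stay inside this block
        self.opt.step()
        # detach outputs so the next block sees constants, not nodes of this block's graph
        return h.detach(), t_enc.detach(), loss.item()

# Toy data: 8-dimensional inputs, one-hot targets over 4 classes.
x = torch.randn(32, 8)
t = nn.functional.one_hot(torch.randint(0, 4, (32,)), 4).float()

blocks = [LocalBlock(8, 16, 4), LocalBlock(16, 16, 16), LocalBlock(16, 16, 16)]

for epoch in range(5):
    h, target = x, t
    for b in blocks:
        h, target, loss = b.local_step(h, target)  # each block trains on its own objective

Because every update uses only quantities local to a block, block k can process mini-batch i while block k-1 is already working on mini-batch i+1; overlapping the ℓ blocks across n mini-batches in this way is what brings the sequential training schedule from roughly nℓ steps down to about n+ℓ.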
