Training Deep Architectures Without End-to-End Backpropagation: A Brief Survey
arXiv - CS - Neural and Evolutionary Computing. Pub Date: 2021-01-09, DOI: arxiv-2101.03419
Shiyu Duan, Jose C. Principe

This tutorial paper surveys training alternatives to end-to-end backpropagation (E2EBP), the de facto standard for training deep architectures. Modular training refers to strictly local training that uses neither an end-to-end forward pass nor an end-to-end backward pass: a deep architecture is divided into several nonoverlapping modules, and the modules are trained separately without any end-to-end operation. Between the fully global E2EBP and the strictly local modular training lie "weakly modular" hybrids, which dispense with the end-to-end backward pass only. These alternatives can match or surpass the performance of E2EBP on challenging datasets such as ImageNet, and they are gaining attention primarily because they offer practical advantages over E2EBP, which will be enumerated herein. In particular, they allow for greater modularity and transparency in deep learning workflows, aligning deep learning with mainstream computer science engineering, which relies heavily on modularization for scalability. Modular training has also revealed novel insights about learning and may have further implications for other important research domains. Specifically, it induces natural and effective solutions to some important practical problems such as data efficiency and transferability estimation.
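
As a rough illustration of the modular training idea described in the abstract, the following minimal sketch trains a two-module model one module at a time. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data, the single tanh unit used as module 1, the throwaway auxiliary head used as module 1's local objective, and all function names. Module 1 is fit first against its local objective, then frozen; module 2 is fit afterwards on module 1's outputs, so no gradient ever crosses the module boundary.

```python
import math

# Toy, linearly separable data (hypothetical; not from the paper).
X = [(2.0, 1.0), (1.0, 2.0), (0.5, 1.0), (-2.0, -1.0), (-1.0, -2.0), (-0.5, -1.0)]
Y = [1, 1, 1, 0, 0, 0]


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def module1_forward(params, x):
    """Module 1: a single tanh unit, h = tanh(w . x + b)."""
    w0, w1, b = params
    return math.tanh(w0 * x[0] + w1 * x[1] + b)


def train_module1(steps=500, lr=0.1):
    """Fit module 1 against a local objective only.

    The local objective is the logistic loss of an auxiliary head (a, c)
    attached to the module's output. This proxy is an assumed choice; the
    head is discarded once module 1 has been trained.
    """
    w0, w1, b = 0.1, -0.05, 0.0  # module 1 weights
    a, c = 0.1, 0.0              # auxiliary head
    n = len(X)
    for _ in range(steps):
        g_w0 = g_w1 = g_b = g_a = g_c = 0.0
        for x, y in zip(X, Y):
            h = module1_forward((w0, w1, b), x)
            ds = sigmoid(a * h + c) - y   # d(loss)/d(head logit)
            dz = ds * a * (1.0 - h * h)   # back through tanh, inside module 1
            g_a += ds * h
            g_c += ds
            g_w0 += dz * x[0]
            g_w1 += dz * x[1]
            g_b += dz
        # Full-batch gradient step on module 1 and its auxiliary head.
        w0 -= lr * g_w0 / n
        w1 -= lr * g_w1 / n
        b -= lr * g_b / n
        a -= lr * g_a / n
        c -= lr * g_c / n
    return (w0, w1, b)


def train_module2(m1_params, steps=500, lr=0.1):
    """Fit module 2 (a logistic output unit) on frozen module 1 outputs."""
    # Module 1 is frozen, so its outputs are computed once and reused.
    feats = [module1_forward(m1_params, x) for x in X]
    v, d = 0.0, 0.0
    n = len(X)
    for _ in range(steps):
        g_v = g_d = 0.0
        for h, y in zip(feats, Y):
            ds = sigmoid(v * h + d) - y
            g_v += ds * h  # gradient stops here; nothing reaches module 1
            g_d += ds
        v -= lr * g_v / n
        d -= lr * g_d / n
    return (v, d)


def predict(m1_params, m2_params, x):
    v, d = m2_params
    return 1 if v * module1_forward(m1_params, x) + d > 0 else 0


m1 = train_module1()
m2 = train_module2(m1)
accuracy = sum(predict(m1, m2, x) == y for x, y in zip(X, Y)) / len(X)
print(accuracy)
```

In this sketch the two training loops share nothing except module 1's stored outputs, so either module could be retrained or replaced without touching the other. That separation is the kind of workflow modularity the abstract refers to, though the methods surveyed in the paper may use quite different local objectives.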

Updated: 2021-01-12