Learn Faster and Forget Slower via Fast and Stable Task Adaptation
arXiv - CS - Neural and Evolutionary Computing. Pub Date: 2020-07-02, DOI: arxiv-2007.01388
Farshid Varno and Lucas May Petry and Lisa Di Jorio and Stan Matwin

Training Deep Neural Networks (DNNs) is still highly time-consuming and compute-intensive. It has been shown that adapting a pretrained model may significantly accelerate this process. With a focus on classification, we show that current fine-tuning techniques make pretrained models catastrophically forget the transferred knowledge even before anything about the new task is learned. Such rapid knowledge loss undermines the merits of transfer learning and may result in a much slower convergence rate compared to when the maximum amount of knowledge is exploited. We investigate the source of this problem from different perspectives and, to alleviate it, introduce Fast And Stable Task-adaptation (FAST), an easy-to-apply fine-tuning algorithm. The paper provides a novel geometric perspective on how the loss landscapes of the source and target tasks are linked under different transfer learning strategies. We empirically show that, compared to prevailing fine-tuning practices, FAST learns the target task faster and forgets the source task slower. The code is available at https://github.com/fvarno/FAST.
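For concreteness, below is a minimal sketch of the prevailing fine-tuning practice the abstract contrasts FAST against: load an ImageNet-pretrained backbone, replace its classification head with a randomly initialized one for the target task, and update all parameters end-to-end with a single learning rate. The framework (PyTorch/torchvision), model (ResNet-18), and hyperparameters are illustrative assumptions rather than the paper's setup, and FAST itself is not implemented here; see the authors' repository linked above for their code.

```python
# Sketch of the common fine-tuning baseline (assumed setup, not the paper's).
import torch
import torch.nn as nn
import torchvision


def build_finetune_model(num_target_classes: int) -> nn.Module:
    # Pretrained backbone; only the head is re-initialized for the target task.
    model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
    model.fc = nn.Linear(model.fc.in_features, num_target_classes)
    return model


def finetune(model: nn.Module, loader, epochs: int = 5, lr: float = 1e-3) -> nn.Module:
    criterion = nn.CrossEntropyLoss()
    # A single learning rate for the fresh head and the pretrained backbone,
    # so backbone weights are updated from the very first step.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
    return model
```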

Updated: 2020-07-06