Knowledge Distillation for Multi-task Learning
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2020-07-14, DOI: arxiv-2007.06889
Wei-Hong Li and Hakan Bilen

Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all tasks at a lower computational cost. Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics (e.g. cross-entropy, Euclidean loss), which leads to the imbalance problem in multi-task learning. To address the imbalance problem, we propose a knowledge distillation based method in this work. We first learn a task-specific model for each task. We then learn the multi-task model to minimize the task-specific losses and to produce the same features as the task-specific models. As each task-specific network encodes different features, we introduce small task-specific adaptors that project the multi-task features onto the task-specific features. In this way, the adaptors align the task-specific and multi-task features, which enables balanced parameter sharing across tasks. Extensive experimental results demonstrate that our method optimizes a multi-task learning model in a more balanced way and achieves better overall performance.
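The training scheme described in the abstract can be sketched as follows. This is a minimal PyTorch illustration, not the authors' implementation: the backbone architecture, feature dimension, the MSE feature-matching loss, and the weighting factor lam are assumptions made only to show how frozen task-specific teachers, a shared multi-task network, and small per-task adaptors fit together.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    """Shared backbone with one prediction head and one small adaptor per task."""
    def __init__(self, num_tasks, feat_dim=64, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One prediction head per task; the task-specific losses are computed on these.
        self.heads = nn.ModuleList([nn.Linear(feat_dim, num_classes) for _ in range(num_tasks)])
        # Small per-task adaptors project the shared feature into each teacher's feature space.
        self.adaptors = nn.ModuleList([nn.Linear(feat_dim, feat_dim) for _ in range(num_tasks)])

    def forward(self, x):
        shared = self.backbone(x)
        preds = [head(shared) for head in self.heads]
        adapted = [adaptor(shared) for adaptor in self.adaptors]
        return preds, adapted

def train_step(model, teachers, x, targets, task_losses, optimizer, lam=1.0):
    """One update: per-task losses plus feature matching against frozen task-specific teachers."""
    preds, adapted = model(x)
    loss = 0.0
    for t, teacher in enumerate(teachers):
        with torch.no_grad():
            teacher_feat = teacher(x)  # feature from the frozen single-task model
        loss = loss + task_losses[t](preds[t], targets[t])        # task-specific loss
        loss = loss + lam * F.mse_loss(adapted[t], teacher_feat)  # feature-distillation loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)

if __name__ == "__main__":
    num_tasks, feat_dim = 2, 64
    # Stand-ins for the task-specific teachers; in the paper these are trained per task beforehand.
    teachers = [nn.Sequential(nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten()).eval()
                for _ in range(num_tasks)]
    model = MultiTaskNet(num_tasks, feat_dim)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(4, 3, 32, 32)
    targets = [torch.randint(0, 10, (4,)) for _ in range(num_tasks)]
    task_losses = [nn.CrossEntropyLoss() for _ in range(num_tasks)]
    print(train_step(model, teachers, x, targets, task_losses, optimizer))

Note that the shared backbone is never asked to reproduce any single teacher's feature exactly; each adaptor absorbs the task-specific projection, which is what the abstract credits for keeping parameter sharing balanced across tasks.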

Updated: 2020-09-25