Knowledge Distillation: A Survey
International Journal of Computer Vision (IF 11.6), Pub Date: 2021-03-22, DOI: 10.1007/s11263-021-01453-z
Jianping Gou, Baosheng Yu, Stephen J. Maybank, Dacheng Tao

In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver billions of model parameters. However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedded devices, not only because of the high computational complexity but also the large storage requirements. To this end, a variety of model compression and acceleration techniques have been developed. As a representative type of model compression and acceleration, knowledge distillation effectively learns a small student model from a large teacher model. It has received rapidly increasing attention from the community. This paper provides a comprehensive survey of knowledge distillation from the perspectives of knowledge categories, training schemes, teacher–student architectures, distillation algorithms, performance comparison and applications. Furthermore, challenges in knowledge distillation are briefly reviewed, and directions for future research are discussed.
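To make the teacher–student setup mentioned in the abstract concrete, the sketch below shows the widely used temperature-scaled logit distillation loss introduced by Hinton et al. (2015). It is a minimal PyTorch illustration, not code from the survey; the temperature `T` and weight `alpha` are assumed hyperparameters chosen for demonstration.

```python
# Minimal sketch of logit-based knowledge distillation (Hinton et al., 2015).
# Assumes a frozen teacher and a trainable student; T and alpha are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Combine hard-label cross-entropy with a soft-label KL term."""
    # Standard supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    # KL divergence between temperature-softened teacher and student distributions;
    # the T*T factor keeps gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * hard_loss + (1.0 - alpha) * soft_loss

# Usage (hypothetical models): given teacher(x) with gradients detached,
# loss = distillation_loss(student(x), teacher(x).detach(), y)
```

In practice, the soft targets carry "dark knowledge" about inter-class similarities that hard labels alone do not, which is why the student can approach the teacher's accuracy with far fewer parameters.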




Updated: 2021-05-24