Enhancing Data-Free Adversarial Distillation with Activation Regularization and Virtual Interpolation
arXiv - CS - Artificial Intelligence. Pub Date: 2021-02-23, DOI: arXiv:2102.11638
Xiaoyang Qu, Jianzong Wang, Jing Xiao

Knowledge distillation is a technique for transferring knowledge from a large trained model, or an ensemble of trained models, to a small model. It relies on access to the original training set, which is not always available. A possible solution is a data-free adversarial distillation framework, which deploys a generative network to transfer the teacher model's knowledge to the student model. However, data generation efficiency is low in data-free adversarial distillation. We add an activation regularizer and a virtual interpolation method to improve it. The activation regularizer enables the student to match the teacher's predictions close to activation boundaries and decision boundaries, and the virtual interpolation method generates virtual samples and labels in between decision boundaries. Our experiments show that this approach surpasses state-of-the-art data-free distillation methods: without any original training data, the student model achieves 95.42% accuracy on CIFAR-10 and 77.05% accuracy on CIFAR-100, 13.8% higher than the state-of-the-art data-free method on CIFAR-100.
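The pipeline the abstract describes (a generator synthesizes samples, the student is trained to imitate the teacher on them, the generator is trained adversarially, and an activation regularizer plus mixup-style virtual interpolation are layered on top) can be sketched as below. This is a minimal PyTorch illustration under our own assumptions: the generator architecture, the loss weights alpha/beta, the L1 disagreement term, and the exact forms of the activation regularizer and the interpolation are placeholders, not the authors' released implementation.

```python
# Minimal sketch of data-free adversarial distillation with an activation
# regularizer and virtual interpolation. Architectures, loss weights, and the
# exact regularizer/interpolation forms are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Maps latent noise to synthetic 32x32 RGB images (CIFAR-sized)."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (128, 8, 8)),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

def virtual_interpolation(x, y_soft, beta_param=1.0):
    """Mixup-style interpolation of samples and their teacher soft labels."""
    lam = torch.distributions.Beta(beta_param, beta_param).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    return lam * x + (1 - lam) * x[perm], lam * y_soft + (1 - lam) * y_soft[perm]

def distill_step(teacher, student, generator, opt_s, opt_g,
                 z_dim=100, batch=64, alpha=0.1, beta=1.0, device="cpu"):
    teacher.eval()

    # Generator step: maximize student-teacher disagreement (adversarial) while
    # an activation regularizer pushes samples toward strong teacher activations.
    x = generator(torch.randn(batch, z_dim, device=device))
    t_out = teacher(x)
    disagreement = F.l1_loss(student(x), t_out)
    act_reg = -t_out.abs().mean()          # assumed form of the regularizer
    g_loss = -disagreement + alpha * act_reg
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    # Student step: imitate the teacher on fresh generated samples and on
    # virtually interpolated samples/labels lying between decision boundaries.
    x = generator(torch.randn(batch, z_dim, device=device)).detach()
    with torch.no_grad():
        t_soft = F.softmax(teacher(x), dim=1)
    x_mix, t_mix = virtual_interpolation(x, t_soft)
    s_loss = (F.kl_div(F.log_softmax(student(x), dim=1), t_soft, reduction="batchmean")
              + beta * F.kl_div(F.log_softmax(student(x_mix), dim=1), t_mix, reduction="batchmean"))
    opt_s.zero_grad()
    s_loss.backward()
    opt_s.step()
    return g_loss.item(), s_loss.item()
```

Each call alternates the two updates; the teacher stays frozen throughout, and only the generator and student parameters are optimized.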

Updated: 2021-02-24