Evolutionary Multi-Objective Model Compression for Deep Neural Networks
IEEE Computational Intelligence Magazine ( IF 10.3 ) Pub Date : 2021-07-21 , DOI: 10.1109/mci.2021.3084393
Zhehui Wang , Tao Luo , Miqing Li , Joey Tianyi Zhou , Rick Siow Mong Goh , Liangli Zhen

While deep neural networks (DNNs) deliver state-of-the-art accuracy on applications ranging from face recognition to language translation, they come at the cost of high computational and space complexity, hindering their deployment on edge devices. To enable efficient DNN inference, a novel approach, called Evolutionary Multi-Objective Model Compression (EMOMC), is proposed to optimize energy efficiency (or model size) and accuracy simultaneously. Specifically, the network pruning and quantization spaces are explored and exploited through the evolution of an architecture population. Furthermore, by taking advantage of the orthogonality between pruning and quantization, a two-stage pruning and quantization co-optimization strategy is developed, which considerably reduces the time cost of the architecture search. Lastly, different dataflow designs and parameter coding schemes are considered in the optimization process, since they have a significant impact on energy consumption and model size. Owing to the cooperative evolution of different architectures in the population, a set of compact DNNs that offer trade-offs among the objectives (e.g., accuracy, energy efficiency, and model size) can be obtained in a single run. Unlike most existing approaches, which are designed to reduce the size of the weight parameters without significant loss of accuracy, the proposed method aims to achieve trade-offs between the desired objectives, to meet the differing requirements of various edge devices. Experimental results demonstrate that the proposed approach can obtain a diverse population of compact DNNs suitable for a broad range of memory usage and energy consumption requirements. Under negligible accuracy loss, EMOMC improves the energy efficiency and model compression rate of VGG-16 on CIFAR-10 by factors of more than 8.9X and 2.4X, respectively.
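The core idea of evolving a population of compression configurations toward a Pareto front of accuracy versus model size can be sketched with a minimal example. Note the configuration encoding (a pruning ratio and a quantization bit-width), the objective proxies in `evaluate`, and the selection scheme below are illustrative assumptions, not the paper's actual implementation (which uses a two-stage pruning/quantization co-optimization and hardware-aware energy models):

```python
import random

def evaluate(cfg):
    """Toy objective proxies, both minimized: (error, model size).

    More pruning and fewer bits shrink the model but raise the error.
    These formulas are stand-ins for real accuracy/energy measurements.
    """
    prune, bits = cfg
    error = prune ** 2 + 1.0 / bits
    size = (1.0 - prune) * bits
    return error, size

def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(pop):
    """Return the non-dominated configurations in pop."""
    scored = [(cfg, evaluate(cfg)) for cfg in pop]
    return [c for c, f in scored
            if not any(dominates(g, f) for _, g in scored if g != f)]

def mutate(cfg, rng):
    """Perturb pruning ratio and bit-width within sane bounds."""
    prune, bits = cfg
    prune = min(0.95, max(0.0, prune + rng.uniform(-0.1, 0.1)))
    bits = max(2, min(16, bits + rng.choice([-1, 0, 1])))
    return (prune, bits)

def evolve(generations=30, pop_size=20, seed=0):
    """Simple (mu + lambda) evolutionary loop with Pareto-based selection."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.0, 0.9), rng.randint(2, 16)) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(rng.choice(pop), rng) for _ in range(pop_size)]
        merged = pop + offspring
        front = pareto_front(merged)          # non-dominated survive first
        rest = [c for c in merged if c not in front]
        pop = (front + rest)[:pop_size]
    return pareto_front(pop)                  # final trade-off set
```

A single call to `evolve()` returns a set of mutually non-dominated (pruning ratio, bit-width) configurations, mirroring how EMOMC delivers a whole family of compact models, rather than one fixed compression target, in one run.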

Updated: 2021-07-21