An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning
arXiv - CS - Distributed, Parallel, and Cluster Computing Pub Date : 2020-11-20 , DOI: arxiv-2011.10170
Chengming Zhang, Geng Yuan, Wei Niu, Jiannan Tian, Sian Jin, Donglin Zhuang, Zhe Jiang, Yanzhi Wang, Bin Ren, Shuaiwen Leon Song, Dingwen Tao

Convolutional neural networks (CNNs) are becoming increasingly deeper, wider, and more non-linear because of the growing demand for prediction accuracy and analysis quality. Wide and deep CNNs, however, require large amounts of computing resources and processing time. Many previous works have studied model pruning to improve inference performance, but little work has been done on effectively reducing training cost. In this paper, we propose ClickTrain: an efficient and accurate end-to-end training and pruning framework for CNNs. Unlike existing pruning-during-training work, ClickTrain provides higher model accuracy and compression ratio via fine-grained architecture-preserving pruning. By leveraging pattern-based pruning with our proposed novel accurate weight importance estimation, dynamic pattern generation and selection, and compiler-assisted computation optimizations, ClickTrain generates highly accurate and fast pruned CNN models for direct deployment without any time overhead compared with the baseline training. ClickTrain also reduces the end-to-end time cost of the state-of-the-art pruning-after-training methods by up to about 67% with comparable accuracy and compression ratio. Moreover, compared with the state-of-the-art pruning-during-training approach, ClickTrain reduces the accuracy drop by up to 2.1% and improves the compression ratio by up to 2.2x on the tested datasets, under similar limited training time.
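The core idea of pattern-based pruning, as the abstract describes it, is to zero out weights within each small convolution kernel according to a fixed sparsity pattern chosen from a candidate set, preserving the layer architecture while skipping computation. A minimal illustrative sketch in plain Python follows; the 4-entry candidate patterns and the magnitude-based selection rule here are assumptions for illustration, not ClickTrain's actual pattern set or its proposed weight importance estimator.

```python
# Hypothetical sketch of pattern-based pruning for a 3x3 conv kernel:
# each kernel keeps only the weights at the positions of one pattern
# from a small candidate set; here, the pattern retaining the largest
# total weight magnitude is selected (a simple stand-in for the paper's
# accurate weight importance estimation).

# Four example 4-entry patterns over (row, col) positions of a 3x3 kernel.
CANDIDATE_PATTERNS = [
    {(0, 1), (1, 0), (1, 1), (1, 2)},
    {(1, 0), (1, 1), (1, 2), (2, 1)},
    {(0, 1), (1, 1), (1, 2), (2, 1)},
    {(0, 1), (1, 0), (1, 1), (2, 1)},
]

def prune_kernel(kernel):
    """Zero a 3x3 kernel everywhere except the best-fitting pattern's positions."""
    def retained_magnitude(pattern):
        # Importance proxy: sum of absolute weights the pattern would keep.
        return sum(abs(kernel[r][c]) for r, c in pattern)

    best = max(CANDIDATE_PATTERNS, key=retained_magnitude)
    return [[kernel[r][c] if (r, c) in best else 0.0 for c in range(3)]
            for r in range(3)]

kernel = [[0.1, 0.9, 0.2],
          [0.8, 1.0, 0.7],
          [0.1, 0.6, 0.3]]
pruned = prune_kernel(kernel)  # exactly 4 weights survive per kernel
```

Because every kernel ends up with the same number of nonzeros at one of a few known positions, a compiler can generate dense, branch-free code for the sparse kernels, which is the kind of compiler-assisted computation optimization the abstract refers to.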

Updated: 2020-11-23