Accelerating Deep Learning Systems via Critical Set Identification and Model Compression
IEEE Transactions on Computers (IF 3.7) | Pub Date: 2020-01-01 | DOI: 10.1109/tc.2020.2970917
Rui Han, Chi Harold Liu, Shilin Li, Shilin Wen, Xue Liu

Modern distributed engines are increasingly deployed to accelerate large-scale deep learning (DL) training jobs. While the parallelism of distributed workers/nodes promises scalability, the computation and communication overheads of the underlying iterative solving algorithms, e.g., stochastic gradient descent, unfortunately become the bottleneck for distributed DL training jobs. Existing approaches address such limitations by designing more efficient synchronization algorithms and model compression techniques, but do not adequately address the issues of processing massive datasets. In this article, we propose ClipDL, which accelerates deep learning systems by simultaneously decreasing the number of model parameters and restricting computation to critical data only. The core component of ClipDL is the estimation of the critical set, based on the observation that in many prevalent DL algorithms a large proportion of the input data has little influence on model parameter updates. We implemented ClipDL on Spark (a popular distributed engine for big data) and BigDL (based on the de facto distributed DL training architecture, the parameter server), and integrated it with representative model compression techniques. Extensive experiments on real DL applications and datasets show that ClipDL accelerates the model training process by an average of 2.32 times while incurring an average accuracy loss of only 1.86 percent.

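The critical-set idea can be illustrated with a minimal sketch. The abstract does not specify ClipDL's exact selection criterion, so the loss-magnitude heuristic, function names, and keep ratio below are all illustrative assumptions: samples whose recent loss is small contribute little to parameter updates and are skipped in the next pass.

```python
import numpy as np

def select_critical_set(per_sample_losses, keep_ratio=0.5):
    """Illustrative sketch: treat the samples with the largest recent
    losses as the 'critical set' and skip the rest next epoch.
    (ClipDL's actual estimation may differ; this is an assumed
    loss-magnitude heuristic, not the paper's algorithm.)"""
    n_keep = max(1, int(len(per_sample_losses) * keep_ratio))
    # Indices of the n_keep samples with the highest loss values.
    return np.argsort(per_sample_losses)[-n_keep:]

# Hypothetical usage within one training epoch:
losses = np.random.rand(10_000)           # per-sample losses from the last pass
critical = select_critical_set(losses)    # train only on these samples next
print(f"training on {len(critical)} of {len(losses)} samples")
```

Under this kind of scheme, the skipped samples still exist in the dataset; they are simply excluded from gradient computation for the next pass, which is where the reduction in per-epoch computation would come from.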
Updated: 2020-01-01