DyVEDeep
ACM Transactions on Embedded Computing Systems (IF 2), Pub Date: 2020-06-12, DOI: 10.1145/3372882
Sanjay Ganapathy 1, Swagath Venkataramani 2, Giridhur Sriraman 1, Balaraman Ravindran 3, Anand Raghunathan 4

Deep Neural Networks (DNNs) have advanced the state-of-the-art in a variety of machine learning tasks and are deployed in increasing numbers of products and services. However, the computational requirements of training and evaluating large-scale DNNs are growing at a much faster pace than the capabilities of the underlying hardware platforms that they are executed upon. To address this challenge, one promising approach is to exploit the error-resilient nature of DNNs by skipping or approximating computations that have negligible impact on classification accuracy. Almost all prior efforts in this direction propose static DNN approximations by either pruning network connections, implementing computations at lower precision, or compressing weights. In this work, we propose Dynamic Variable Effort Deep Neural Networks (DyVEDeep) to reduce the computational requirements of DNNs during inference. Complementary to the aforementioned static approaches, DyVEDeep is a dynamic approach that exploits heterogeneity in the DNN inputs to improve their compute efficiency with comparable classification accuracy and without requiring any re-training. DyVEDeep equips DNNs with dynamic effort mechanisms that identify computations critical to classifying a given input and focus computational effort only on those critical computations, while skipping or approximating the rest. We propose three dynamic effort mechanisms that operate at different levels of granularity, viz. the neuron, feature, and layer levels. We build DyVEDeep versions of six popular image recognition benchmarks (CIFAR-10, AlexNet, OverFeat, VGG-16, SqueezeNet, and Deep-Compressed-AlexNet) within the Caffe deep-learning framework. We evaluate DyVEDeep on two platforms: a high-performance server with a 2.7 GHz Intel Xeon E5-2680 processor and 128 GB memory, and a low-power Raspberry Pi board with an ARM Cortex A53 processor and 1 GB memory. Across all benchmarks, DyVEDeep achieves a 2.47×–5.15× reduction in the number of scalar operations, which translates to 1.94×–2.23× and 1.46×–3.46× performance improvements over well-optimized baselines on the Xeon server and the Raspberry Pi, respectively, with comparable classification accuracy.
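The abstract names the three granularities (neuron, feature, and layer) but does not spell out the mechanisms themselves, so the sketch below is only an illustration of the general idea, not the paper's algorithm: for a single ReLU neuron, evaluate a fraction of the largest-magnitude weight-input products first, and skip the remaining multiply-accumulates when the partial sum already predicts an output saturated at zero. The function name, the sampling strategy, and the skip threshold are assumptions made for this example.

import numpy as np

def neuron_dynamic_effort_sketch(weights, inputs, bias,
                                 sample_frac=0.25, skip_threshold=-1.0):
    # A minimal sketch of a neuron-level dynamic-effort check (illustrative
    # assumption, not DyVEDeep's actual mechanism).
    weights = np.asarray(weights, dtype=np.float64)
    inputs = np.asarray(inputs, dtype=np.float64)
    n = weights.size
    k = max(1, int(sample_frac * n))

    # Evaluate the largest-magnitude weights first; they dominate the sum.
    order = np.argsort(-np.abs(weights))
    sampled, rest = order[:k], order[k:]
    partial = float(weights[sampled] @ inputs[sampled]) + bias

    # If the partial pre-activation is already strongly negative, predict that
    # the ReLU output saturates at zero and skip the remaining MACs.
    if partial < skip_threshold:
        return 0.0

    # Otherwise spend full effort and compute the exact activation.
    full = partial + float(weights[rest] @ inputs[rest])
    return max(full, 0.0)

# Example: roughly a quarter of the multiply-accumulates are always performed;
# the rest are skipped whenever the early estimate predicts a zero output.
rng = np.random.default_rng(0)
w, x = rng.normal(size=256), rng.random(size=256)
print(neuron_dynamic_effort_sketch(w, x, bias=0.0))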

Updated: 2020-06-12