Data-Aware Compression of Neural Networks, IEEE Computer Architecture Letters


Data-Aware Compression of Neural Networks
IEEE Computer Architecture Letters (IF 2.3), Pub Date: 2021-07-13, DOI: 10.1109/lca.2021.3096191
Hajar Falahati, Masoud Peyro, Hossein Amini, Mehran Taghian, Mohammad Sadrosadati, Pejman Lotfi-Kamran, Hamid Sarbazi-Azad

Deep neural networks (DNNs) are getting deeper and larger, which intensifies their data movement and compute demands. Prior work reduces data movement and computation by exploiting sparsity and similarity, but it considers only sparsity and weight similarity; none of it exploits input similarity. By jointly analyzing the similarity and sparsity of inputs and weights, we show that memory accesses and computations can be reduced by 5.7× and 4.1×, respectively, beyond what exploiting sparsity alone achieves, and by 3.9× and 2.1× beyond what exploiting weight similarity alone achieves. We propose a new data-aware compression approach, called DANA, that effectively utilizes both sparsity and similarity in inputs and weights. DANA is orthogonal to the underlying hardware and can be implemented on top of different DNN accelerators. As an example, we implement DANA on top of an Eyeriss-like architecture. Our results over four well-known DNNs show that DANA outperforms Eyeriss in average performance and energy consumption by 18× and 83×, respectively. Moreover, DANA is 4.6× and 4.5× faster than the state-of-the-art sparsity-aware and similarity-aware techniques, respectively, and reduces average energy consumption relative to them by 3.0× and 5.8×.
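To make the general idea concrete, the following is a minimal sketch of how sparsity and value similarity can each remove multiplications from a single dot product. It is not DANA's algorithm, which the abstract does not describe in detail. The function names `dense_dot` and `grouped_dot` and the small integer vectors are hypothetical, and the sketch assumes quantized (integer) inputs and weights so that repeated values compare exactly.

```python
from collections import defaultdict


def dense_dot(inputs, weights):
    """Baseline: one multiplication per (input, weight) pair."""
    total = 0
    for x, w in zip(inputs, weights):
        total += x * w
    return total, len(inputs)


def grouped_dot(values, factors):
    """Dot product with one multiplication per distinct nonzero factor.

    Sparsity: pairs with a zero operand are skipped entirely.
    Similarity: values that share the same factor are summed first,
    so the shared factor is multiplied only once.
    Returns (result, number_of_multiplications).
    """
    sums = defaultdict(int)
    for v, f in zip(values, factors):
        if v == 0 or f == 0:
            continue  # the product would be zero, so skip it
        sums[f] += v  # additions replace repeated multiplications
    total = sum(f * s for f, s in sums.items())
    return total, len(sums)


# Hypothetical quantized activations and weights (illustration only).
inputs = [3, 0, 3, 3, 5, 0, 5, 3]
weights = [2, 2, 0, 4, 2, 7, 4, 2]

print(dense_dot(inputs, weights))    # baseline
print(grouped_dot(inputs, weights))  # group by repeated weight values
print(grouped_dot(weights, inputs))  # group by repeated input values
```

Calling `grouped_dot(inputs, weights)` exploits repeated weights, while swapping the arguments exploits repeated inputs, which is the kind of similarity the abstract says prior work overlooked. In this toy case both groupings give the same result as the dense baseline with fewer multiplications; the savings on real networks depend on the actual value distributions.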


Updated: 2021-07-30
