Adapting In Situ Accelerators for Sparsity With Granular Matrix Reordering
IEEE Computer Architecture Letters (IF 1.4), Pub Date: 2020-07-01, DOI: 10.1109/lca.2020.3031907
Darya Mikhailenko, Yujin Nakamoto, Ben Feinberg, Engin Ipek

Neural network (NN) inference is an essential part of modern systems and is found at the heart of numerous applications ranging from image recognition to natural language processing. In situ NN accelerators can efficiently perform NN inference using resistive crossbars, which makes them a promising solution to the data movement challenges faced by conventional architectures. Although such accelerators demonstrate significant potential for dense NNs, they often do not benefit from sparse NNs, which contain relatively few non-zero weights. Processing sparse NNs on in situ accelerators results in wasted energy to charge the entire crossbar where most elements are zeros. To address this limitation, this letter proposes Granular Matrix Reordering (GMR): a preprocessing technique that enables an energy-efficient computation of sparse NNs on in situ accelerators. GMR reorders the rows and columns of sparse weight matrices to maximize the crossbars’ utilization and minimize the total number of crossbars needed to be charged. The reordering process does not rely on sparsity patterns and incurs no accuracy loss. Overall, GMR achieves an average of 28 percent and up to 34 percent reduction in energy consumption over seven pruned NNs across four different pruning methods and network architectures.
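
The abstract does not describe the GMR algorithm itself, so the following is only a toy sketch of the underlying idea: permuting the rows and columns of a sparse weight matrix can reduce the number of crossbar-sized tiles that contain any non-zero weight, without changing the computed result. The tile size, the example matrix, and the helper names (`active_tiles`, `support_order`, `naive_reorder`) are all assumptions made for illustration, and the sorting heuristic is a naive stand-in, not the authors' method.

```python
def active_tiles(w, tile):
    """Count tile x tile blocks of w that hold at least one non-zero weight."""
    n_rows, n_cols = len(w), len(w[0])
    count = 0
    for r0 in range(0, n_rows, tile):
        for c0 in range(0, n_cols, tile):
            if any(w[r][c] != 0
                   for r in range(r0, min(r0 + tile, n_rows))
                   for c in range(c0, min(c0 + tile, n_cols))):
                count += 1
    return count


def permute(w, row_order, col_order):
    """Return w with rows and columns rearranged in the given orders."""
    return [[w[r][c] for c in col_order] for r in row_order]


def support_order(vectors):
    """Order indices so vectors with the same non-zero positions are adjacent
    and all-zero vectors come last (a naive heuristic, NOT the GMR algorithm)."""
    def key(i):
        support = tuple(j for j, v in enumerate(vectors[i]) if v != 0)
        return (len(support) == 0, support)
    return sorted(range(len(vectors)), key=key)


def naive_reorder(w):
    """Pick a row order, then a column order on the row-permuted matrix."""
    row_order = support_order(w)
    rows_sorted = [w[r] for r in row_order]
    columns = [list(col) for col in zip(*rows_sorted)]
    col_order = support_order(columns)
    return row_order, col_order


def matvec(w, x):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in w]


# Hypothetical 4x4 sparse layer mapped onto 2x2 "crossbars".
W = [[5, 0, 7, 0],
     [0, 0, 0, 0],
     [3, 0, 2, 0],
     [0, 0, 0, 0]]
x = [1, 2, 3, 4]

row_order, col_order = naive_reorder(W)
W2 = permute(W, row_order, col_order)

before = active_tiles(W, 2)   # tiles with a non-zero weight, original layout
after = active_tiles(W2, 2)   # same count after reordering

# Permuting the input by col_order yields the original output in row_order.
y = matvec(W, x)
y2 = matvec(W2, [x[c] for c in col_order])
```

In this toy case the four non-zero weights start out spread over all four tiles and end up packed into a single tile, while `y2` is just `y` with its entries rearranged by `row_order`. That is consistent with the abstract's claim that reordering incurs no accuracy loss, since only the positions of inputs and outputs change.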

Updated: 2020-07-01