Space‐address decoupled scratchpad memory management for neural network accelerators
Concurrency and Computation: Practice and Experience (IF 2) Pub Date: 2020-10-13, DOI: 10.1002/cpe.6046
Zhenxing Zhang, Shiyan Sun, Xunyu Chen, Tian Zhi, Qi Guo, Yunji Chen

Deep neural networks have been demonstrated to be useful in a variety of intelligent tasks, and various specialized NN accelerators have been proposed recently to improve hardware efficiency. These accelerators are typically equipped with software-managed scratchpad memory (SPM) for high performance and energy efficiency. However, traditional SPM management techniques cause memory fragmentation on NN accelerators and thus lead to low utilization of the precious SPM. The main reason is that traditional techniques were originally designed for managing fixed-length registers rather than variable-length memory blocks. In this article, we propose a novel SPM management approach for NN accelerators. The basic intuition is that NN computation/memory behaviors are predictable and relatively regular compared with traditional applications, so most information can be determined at compile time. In addition, by exploiting the variable-length nature of SPM blocks, we propose to divide the allocation process into two passes: a space assignment pass and an address assignment pass, which are performed simultaneously (and implicitly) in traditional one-pass allocation techniques. Experimental results on the memory requests of a representative NN accelerator demonstrate that the proposed approach can reduce memory consumption by up to 30% compared with state-of-the-art SPM management techniques, and its memory usage is only 2% larger than that of the theoretical optimal allocation.
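
To make the two-pass idea concrete, the following is a minimal, hypothetical Python sketch of decoupling space assignment from address assignment for compile-time-known SPM requests. It is not the paper's actual algorithm: the Block, space_pass, and address_pass names, the slot-based sharing rule, and the sample requests are all illustrative assumptions; the real allocator exploits much richer compile-time information about NN memory behavior.

from dataclasses import dataclass
from typing import List

@dataclass
class Block:
    name: str
    size: int          # bytes requested (variable-length block)
    first_use: int     # compile-time-known first access
    last_use: int      # compile-time-known last access
    slot: int = -1     # logical slot chosen in the space pass
    addr: int = -1     # concrete offset chosen in the address pass

def space_pass(blocks: List[Block]) -> int:
    """Pass 1 (space assignment): decide which blocks may share space.

    Blocks whose live ranges do not overlap are mapped to the same
    logical slot; no concrete addresses are fixed yet.
    """
    slots: List[List[Block]] = []
    for b in sorted(blocks, key=lambda x: x.first_use):
        for i, slot in enumerate(slots):
            if all(b.first_use > o.last_use or b.last_use < o.first_use
                   for o in slot):
                slot.append(b)
                b.slot = i
                break
        else:
            slots.append([b])
            b.slot = len(slots) - 1
    return len(slots)

def address_pass(blocks: List[Block], num_slots: int) -> int:
    """Pass 2 (address assignment): size each slot to its largest
    resident block, lay slots out contiguously, and assign addresses."""
    slot_size = [0] * num_slots
    for b in blocks:
        slot_size[b.slot] = max(slot_size[b.slot], b.size)
    base, bases = 0, []
    for s in slot_size:
        bases.append(base)
        base += s
    for b in blocks:
        b.addr = bases[b.slot]
    return base  # total SPM bytes consumed

if __name__ == "__main__":
    # Hypothetical compile-time memory requests of a small network.
    reqs = [Block("conv1_out", 4096, 0, 3),
            Block("conv2_out", 8192, 3, 6),
            Block("fc_weights", 2048, 5, 8)]
    n = space_pass(reqs)
    total = address_pass(reqs, n)
    for b in reqs:
        print(f"{b.name}: slot={b.slot} addr={b.addr}")
    print(f"total SPM used: {total} bytes")

In this toy run, conv1_out and fc_weights have disjoint live ranges, so the space pass lets them reuse the same slot, and the address pass only then commits concrete offsets; a one-pass allocator would have to fix both decisions at once, which is where fragmentation of variable-length blocks tends to arise.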
