Dataflow and microarchitecture co-optimisation for sparse CNN on distributed processing element accelerator
IET Circuits, Devices & Systems (IF 1.0) Pub Date: 2020-12-15, DOI: 10.1049/iet-cds.2019.0225
Duc-An Pham, Bo-Cheng Lai

Accelerators that exploit the sparsity of both activation data and network structure in convolutional neural networks (CNNs) have demonstrated efficient CNN processing with superior performance. Previous research has identified three critical concerns when designing accelerators for sparse CNNs: data reuse, parallel computing performance, and effective sparse computation. Each of these factors has been addressed in earlier accelerator designs, but no design has considered all of them at the same time. This study provides analytical approaches and experimental results that reveal insights into accelerator design for sparse CNNs. The authors show that all of these architectural aspects, including their mutual effects, must be considered together to avoid performance pitfalls. Based on the proposed analytical approach, they propose enhancement techniques co-designed across the factors discussed in this study. The improved architecture achieves up to 1.5× higher data reuse and/or a 1.55× performance improvement compared with state-of-the-art sparse CNN accelerators, while maintaining equal area and energy cost.
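
The "effective sparse computation" the abstract refers to can be pictured with a minimal, generic sketch: a convolution that enumerates only non-zero activations and non-zero weights, so every skipped operand pair is a multiply-accumulate the hardware never has to issue. This is purely illustrative and assumes nothing about the paper's actual dataflow, microarchitecture, or processing-element organisation; all names and shapes below are hypothetical.

```python
# Illustrative sketch only: a generic single-channel sparse convolution that
# skips zero-valued activations and weights. This is NOT the authors' design;
# it just shows the kind of computation sparse CNN accelerators exploit.
import numpy as np

def sparse_conv2d(activations: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Valid (no-padding) cross-correlation over a single channel.

    activations: (H, W) input feature map, assumed mostly zero (e.g. post-ReLU).
    weights:     (K, K) pruned kernel, also assumed mostly zero.
    Returns:     (H-K+1, W-K+1) output feature map.
    """
    H, W = activations.shape
    K, _ = weights.shape
    out = np.zeros((H - K + 1, W - K + 1))

    # Enumerate only the non-zero operands on each side.
    nz_act = list(zip(*np.nonzero(activations)))
    nz_wgt = list(zip(*np.nonzero(weights)))

    for (ai, aj) in nz_act:
        a = activations[ai, aj]
        for (ki, kj) in nz_wgt:
            oi, oj = ai - ki, aj - kj  # output position this pair contributes to
            if 0 <= oi < out.shape[0] and 0 <= oj < out.shape[1]:
                out[oi, oj] += a * weights[ki, kj]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    act = rng.random((8, 8)) * (rng.random((8, 8)) > 0.7)   # ~70% zero activations
    wgt = rng.random((3, 3)) * (rng.random((3, 3)) > 0.5)   # ~50% zero weights
    dense = np.array([[(act[i:i + 3, j:j + 3] * wgt).sum() for j in range(6)]
                      for i in range(6)])
    assert np.allclose(sparse_conv2d(act, wgt), dense)      # matches the dense result
```

In hardware, the per-pair output-coordinate computation and bounds check above become the index-matching and accumulation logic distributed across processing elements, which is why data reuse, parallelism, and sparse-computation support interact and need to be co-designed rather than tuned in isolation.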

Updated: 2020-12-18