Automated optimization for memory-efficient high-performance deep neural network accelerators
ETRI Journal (IF 1.3), Pub Date: 2020-07-29, DOI: 10.4218/etrij.2020-0125
HyunMi Kim, Chun-Gi Lyuh, Youngsu Kwon

The increasing size and complexity of deep neural networks (DNNs) necessitate the development of efficient high-performance accelerators. An efficient memory structure and operating scheme, together with dataflow control, provide an intuitive solution for high-performance accelerators. Furthermore, processing various neural networks (NNs) requires a flexible memory architecture, a programmable control scheme, and automated optimizations. We first propose an efficient and flexible architecture that operates at a high frequency despite its large memory and processing-element (PE) array sizes. We then improve the efficiency and usability of this architecture by automating the optimization algorithm. The experimental results show that the architecture increases data reuse; a diagonal write path improves performance by 1.44× on average across a wide range of NNs. The automated optimizations further enhance performance by 3.8× to 14.79× and improve usability. Therefore, automating the optimization, in addition to designing an efficient architecture, is critical to realizing high-performance DNN accelerators.
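To make the idea of "automating the optimization" concrete, the sketch below illustrates one common form such an optimizer can take: a search over loop-tile sizes for a single convolution layer that maximizes on-chip data reuse (equivalently, minimizes estimated off-chip traffic) under a fixed SRAM budget. The cost model, parameter names, tile candidates, and buffer budget are illustrative assumptions for this sketch and are not the paper's actual algorithm or architecture.

```python
# Minimal sketch: tile-size search for one convolution layer on a PE-array
# accelerator. All constants and the cost model below are assumptions.
from itertools import product

# Layer shape: output channels, input channels, output height/width, kernel size
K, C, OH, OW, R = 64, 64, 56, 56, 3
SRAM_BYTES = 512 * 1024   # assumed on-chip buffer budget
BYTES = 2                 # assumed 16-bit operands


def buffer_bytes(tk, tc, th, tw):
    """On-chip footprint of one tile: weights + input patch (with halo) + outputs."""
    weights = tk * tc * R * R
    inputs = tc * (th + R - 1) * (tw + R - 1)
    outputs = tk * th * tw
    return (weights + inputs + outputs) * BYTES


def dram_traffic(tk, tc, th, tw):
    """Rough off-chip traffic estimate: data refetched once per enclosing tile loop."""
    n_k = -(-K // tk)   # ceil division: number of output-channel tiles
    n_h = -(-OH // th)
    n_w = -(-OW // tw)
    weight_traffic = K * C * R * R * n_h * n_w              # weights refetched per spatial tile
    input_traffic = C * (OH + R - 1) * (OW + R - 1) * n_k   # inputs refetched per output-channel tile
    output_traffic = K * OH * OW                            # outputs written once
    return (weight_traffic + input_traffic + output_traffic) * BYTES


best = None
for tk, tc, th, tw in product([8, 16, 32, 64], [8, 16, 32, 64],
                              [7, 14, 28, 56], [7, 14, 28, 56]):
    if buffer_bytes(tk, tc, th, tw) > SRAM_BYTES:
        continue  # tile does not fit in the on-chip buffer
    cost = dram_traffic(tk, tc, th, tw)
    if best is None or cost < best[0]:
        best = (cost, (tk, tc, th, tw))

print("best tiling (Tk, Tc, Th, Tw):", best[1], "estimated DRAM bytes:", best[0])
```

In practice, an accelerator-specific optimizer would replace the toy cost model with one calibrated to the memory hierarchy, write paths, and dataflow of the target hardware, but the overall structure (enumerate legal configurations, estimate cost, keep the best) stays the same.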
