A Survey on Hardware Accelerators and Optimization Techniques for RNNs
Journal of Systems Architecture ( IF 3.7 ) Pub Date : 2020-07-18 , DOI: 10.1016/j.sysarc.2020.101839
Sparsh Mittal , Sumanth Umesh

Recurrent neural networks (RNNs) are powerful artificial intelligence models that have shown remarkable effectiveness in several tasks such as music generation, speech recognition, and machine translation. RNN computations involve both intra-timestep and inter-timestep dependencies. Due to these dependencies, hardware acceleration of RNNs is more challenging than that of CNNs. Recently, several researchers have proposed hardware architectures for RNNs. In this paper, we present a survey of GPU-, FPGA-, and ASIC-based accelerators and optimization techniques for RNNs. We highlight the key ideas of different techniques to bring out their similarities and differences. Improvements in deep-learning algorithms have inevitably gone hand-in-hand with improvements in hardware accelerators. Nevertheless, there is a need and scope for even greater synergy between these two fields. This survey seeks to synergize the efforts of researchers in the areas of deep learning, computer architecture, and chip design.
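The inter-timestep dependency mentioned above is the core obstacle to RNN acceleration: each hidden state depends on the previous one, so timesteps cannot be computed in parallel the way independent CNN layers or batch elements can. A minimal NumPy sketch of a vanilla RNN forward pass (not code from the surveyed paper; layer sizes and names are illustrative) makes the sequential structure explicit:

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, b):
    """Vanilla RNN forward pass: h_t = tanh(W_x @ x_t + W_h @ h_{t-1} + b).

    The intra-timestep work (the two matrix-vector products) parallelizes
    well, but each h_t consumes h_{t-1}, so the loop over timesteps is
    inherently sequential -- the inter-timestep dependency that makes RNN
    acceleration harder than CNN acceleration.
    """
    hidden_size = W_h.shape[0]
    h = np.zeros(hidden_size)
    states = []
    for x_t in x_seq:  # cannot be parallelized: step t needs step t-1
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

# Toy example: 5 timesteps, 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
W_x = rng.standard_normal((4, 3))
W_h = rng.standard_normal((4, 4))
b = np.zeros(4)
x_seq = rng.standard_normal((5, 3))
hs = rnn_forward(x_seq, W_x, W_h, b)  # shape (5, 4)
```

Accelerators surveyed in such work typically attack this structure by parallelizing the intra-timestep matrix operations and by exploiting sparsity or quantization, since the timestep loop itself resists parallelization.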




Updated: 2020-07-20