FARM: A Flexible Accelerator for Recurrent and Memory Augmented Neural Networks
Journal of Signal Processing Systems (IF 1.8) Pub Date: 2020-06-24, DOI: 10.1007/s11265-020-01555-w
Nagadastagiri Challapalle , Sahithi Rampalli , Nicholas Jao , Akshaykrishna Ramanathan , John Sampson , Vijaykrishnan Narayanan

Recently, Memory Augmented Neural Networks (MANNs), a class of Deep Neural Networks (DNNs), have become prominent owing to their ability to effectively capture long-term dependencies in several Natural Language Processing (NLP) tasks. These networks augment conventional DNNs with memory and attention mechanisms external to the network to capture relevant information. Several MANN architectures have shown particular benefits in NLP tasks by augmenting an underlying Recurrent Neural Network (RNN) with external memory accessed through attention mechanisms. Unlike conventional DNNs, whose computation time is dominated by multiply-accumulate (MAC) operations, MANNs exhibit more diverse behavior: in addition to MACs, their attention mechanisms involve operations such as similarity measures, sorting, weighted memory access, and pair-wise arithmetic. Because of this greater diversity of operations, MANNs are not trivially accelerated by the techniques used in existing DNN accelerators. In this work, we present FARM, an end-to-end hardware accelerator architecture for the inference of RNNs and several MANN variants, such as the Differentiable Neural Computer (DNC), the Neural Turing Machine (NTM), and a meta-learning model. FARM achieves average speedups of 30x-190x and 80x-100x over CPU and GPU implementations, respectively. To address the remaining memory bottlenecks in FARM, we then propose the FARM-PIM architecture, which augments FARM with in-memory compute support for MAC and content-similarity operations to reduce data-traversal costs. FARM-PIM offers an additional 1.5x speedup over FARM. Finally, we consider an efficiency-oriented version of the PIM implementation, FARM-PIM-LP, which trades a 20% performance reduction relative to FARM for a 4x average reduction in power consumption.
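The attention operations the abstract names (similarity measure followed by weighted memory access) correspond to the content-based addressing step shared by NTM- and DNC-style MANNs. Below is a minimal NumPy sketch of that step, not the paper's hardware implementation: the function name, the sharpening parameter `beta`, and the toy memory contents are illustrative assumptions.

```python
import numpy as np

def content_addressed_read(memory, key, beta=1.0):
    """Content-based addressing as in NTM/DNC-style MANNs (illustrative sketch):
    score each memory row by cosine similarity to the key, sharpen with
    strength beta, normalize via softmax, and return the weighted read."""
    # Similarity measure: cosine similarity between the key and every row.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sim = memory @ key / norms
    # Softmax over sharpened similarities yields the read weights.
    logits = beta * sim
    w = np.exp(logits - logits.max())
    w /= w.sum()
    # Weighted memory access: convex combination of memory rows.
    return w @ memory, w

# Toy example: a memory with 4 slots of width 3.
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
read_vec, weights = content_addressed_read(M, np.array([1.0, 0.0, 0.0]), beta=10.0)
```

In a conventional DNN the inner loop is the MAC-dominated `memory @ key`; here the normalization, sharpening, and weighted-sum steps add the operation diversity that motivates a dedicated datapath.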




Updated: 2020-06-25