A New MRAM-based Process In-Memory Accelerator for Efficient Neural Network Training with Floating Point Precision
arXiv - CS - Hardware Architecture. Pub Date: 2020-03-02, DOI: arXiv-2003.01551
Hongjie Wang, Yang Zhao, Chaojian Li, Yue Wang, Yingyan Lin

The excellent performance of modern deep neural networks (DNNs) comes at an often prohibitive training cost, limiting the rapid development of DNN innovations and raising various environmental concerns. To reduce the dominant data-movement cost of training, process in-memory (PIM) has emerged as a promising solution, as it alleviates the need to move DNN weights between memory and compute units. However, state-of-the-art PIM DNN training accelerators employ either analog/mixed-signal computing, which has limited precision, or digital computing based on a memory technology that supports only limited logic functions and thus requires a complicated procedure to realize floating-point computation. In this paper, we propose a spin-orbit torque magnetic random access memory (SOT-MRAM) based digital PIM accelerator that supports floating-point precision. Specifically, this new accelerator features an innovative (1) SOT-MRAM cell, (2) full addition design, and (3) floating-point computation. Experimental results show that the proposed SOT-MRAM PIM based DNN training accelerator achieves 3.3$\times$, 1.8$\times$, and 2.5$\times$ improvements in energy, latency, and area, respectively, compared with a state-of-the-art PIM-based DNN training accelerator.
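The abstract does not detail the cell or adder microarchitecture, but the computational idea, building floating-point arithmetic on top of an in-memory full-adder primitive, can be sketched in software. The Python sketch below is illustrative only: the function names, the bit-serial ripple-carry organization, and the simplified positive-operand floating-point format are assumptions for exposition, not the paper's verified design.

# Conceptual sketch: floating-point addition composed solely from a
# 1-bit full-adder primitive, the kind of digital operation a
# SOT-MRAM PIM array could provide. All names and the bit-serial
# organization are illustrative assumptions, not the paper's design.

def full_adder(a: int, b: int, cin: int):
    """1-bit full adder: returns (sum bit, carry-out bit)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_add(x: int, y: int, width: int) -> int:
    """Multi-bit addition as a bit-serial chain of full adders."""
    carry, result = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

def fp_add(exp_x: int, man_x: int, exp_y: int, man_y: int, man_bits: int = 24):
    """Simplified floating-point add for positive operands:
    compare exponents, align the smaller mantissa, then reuse
    the full-adder chain for the mantissa sum."""
    if exp_x < exp_y:
        exp_x, man_x, exp_y, man_y = exp_y, man_y, exp_x, man_x
    man_y >>= (exp_x - exp_y)        # align the smaller operand
    man = ripple_add(man_x, man_y, man_bits + 1)
    if man >> man_bits:              # normalize on mantissa overflow
        man >>= 1
        exp_x += 1
    return exp_x, man

# Example: 1.5 * 2^3 + 1.25 * 2^2 = 17, with 23 fraction bits plus a
# hidden bit. fp_add returns exponent 4 and mantissa 1.0001b, i.e.
# 1.0625 * 2^4 = 17.
exp, man = fp_add(3, 0b110 << 21, 2, 0b101 << 21)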

Updated: 2020-05-13