Resistive Neural Hardware Accelerators
arXiv - CS - Hardware Architecture Pub Date : 2021-09-08 , DOI: arxiv-2109.03934
Kamilya Smagulova, Mohammed E. Fouda, Fadi Kurdahi, Khaled Salama, Ahmed Eltawil

Deep Neural Networks (DNNs), a subset of Machine Learning (ML) techniques, make it possible to learn from real-world data and to make decisions in real time. However, their wide adoption is hindered by a number of software and hardware limitations. The existing general-purpose hardware platforms used to accelerate DNNs face new challenges associated with the growing amount of data and the exponentially increasing complexity of computations. Emerging non-volatile memory (NVM) devices and the processing-in-memory (PIM) paradigm are creating a new generation of hardware architectures with increased computing and storage capabilities. In particular, the shift towards ReRAM-based in-memory computing has great potential for area- and power-efficient inference and for training large-scale neural network architectures. These advances can accelerate the entry of IoT-enabled AI technologies into daily life. In this survey, we review the state-of-the-art ReRAM-based many-core DNN accelerators and show their superiority over their CMOS counterparts. The review covers different aspects of the hardware and software realization of DNN accelerators, their present limitations, and future prospects. In particular, a comparison of the accelerators shows the need for new performance metrics and benchmarking standards. In addition, a major concern for the efficient design of accelerators is the lack of accuracy in simulation tools for software and hardware co-design.

Updated: 2021-09-10