Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Wan, Weier; Kubendran, Rajkumar; Schaefer, Clemens; Eryilmaz, S. Burc; Zhang, Wenqiang; Wu, Dabin; Deiss, Stephen; Raina, Priyanka; Qian, He; Gao, Bin; Joshi, Siddharth; Wu, Huaqiang; Wong, H. -S. Philip; Cauwenberghs, Gert

Computer Science > Hardware Architecture

arXiv:2108.07879 (cs)

[Submitted on 17 Aug 2021]

Authors: Weier Wan (1), Rajkumar Kubendran (2 and 5), Clemens Schaefer (4), S. Burc Eryilmaz (1), Wenqiang Zhang (3), Dabin Wu (3), Stephen Deiss (2), Priyanka Raina (1), He Qian (3), Bin Gao (3), Siddharth Joshi (4 and 2), Huaqiang Wu (3), H.-S. Philip Wong (1), Gert Cauwenberghs (2) ((1) Stanford University, (2) University of California San Diego, (3) Tsinghua University, (4) University of Notre Dame, (5) University of Pittsburgh)

Abstract: Realizing today's cloud-level artificial intelligence functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g., video, audio) with unprecedented energy efficiency. Today's AI hardware architectures cannot meet this demand because of a fundamental "memory wall": moving data between separate compute and memory units consumes substantial energy and incurs long latency. Compute-in-memory (CIM) architectures based on resistive random-access memory (RRAM) promise orders-of-magnitude improvements in energy efficiency by performing computation directly within memory. However, conventional approaches to CIM hardware design limit the functional flexibility needed to process diverse AI workloads, and must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements at any single level of the design. By co-optimizing across all hierarchies of the design, from algorithms and architecture to circuits and devices, we present NeuRRAM, the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility for diverse model architectures, record energy efficiency $5\times$ to $8\times$ better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks, including 99.0% accuracy on MNIST and 85.7% on CIFAR-10 image classification, 84.7% accuracy on Google speech command recognition, and a 70% reduction in image reconstruction error on a Bayesian image recovery task. This work paves the way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future.

Comments:	34 pages, 14 figures, 1 table
Subjects:	Hardware Architecture (cs.AR); Artificial Intelligence (cs.AI); Emerging Technologies (cs.ET); Machine Learning (cs.LG)
Cite as:	arXiv:2108.07879 [cs.AR]
	(or arXiv:2108.07879v1 [cs.AR] for this version)
	https://doi.org/10.48550/arXiv.2108.07879

Submission history

From: Weier Wan
[v1] Tue, 17 Aug 2021 21:08:51 UTC (4,161 KB)
