A Survey on the Optimization of Neural Network Accelerators for Micro-AI On-Device Inference
IEEE Journal on Emerging and Selected Topics in Circuits and Systems (IF 3.7), Pub Date: 2021-11-25, DOI: 10.1109/jetcas.2021.3129415
Arnab Neelim Mazumder, Jian Meng, Hasib-Al Rashid, Utteja Kallakuri, Xin Zhang, Jae-Sun Seo, Tinoosh Mohsenin

Deep neural networks (DNNs) are being prototyped for a variety of artificial intelligence (AI) tasks, including computer vision, data analytics, and robotics. The efficacy of DNNs stems from the state-of-the-art inference accuracy they provide for these applications. However, this accuracy comes at the cost of high computational complexity. Hence, it is becoming increasingly important to scale these DNNs down so that they fit on resource-constrained hardware and edge devices. The main goal is to allow efficient processing of DNNs on low-power micro-AI platforms without compromising hardware resources or accuracy. In this work, we aim to provide a comprehensive survey of recent developments in the energy-efficient deployment of DNNs on micro-AI platforms. To this end, we examine different neural architecture search strategies as part of micro-AI model design, provide extensive detail about model compression and quantization strategies in practice, and finally elaborate on current hardware approaches for efficiently deploying micro-AI models. The main takeaways for the reader are an understanding of the different search spaces used to pinpoint the best micro-AI model configuration, the ability to interpret different quantization and sparsification techniques, and insight into realizing micro-AI models on resource-constrained hardware together with the associated design considerations.
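
As a concrete illustration of the quantization and sparsification techniques the survey covers, the following minimal NumPy sketch applies unstructured magnitude pruning followed by symmetric uniform 8-bit quantization to a toy weight matrix. It is not taken from the paper; the function names and parameter choices are hypothetical, chosen only to make the two compression steps explicit.

    import numpy as np

    def prune_by_magnitude(w: np.ndarray, sparsity: float) -> np.ndarray:
        """Unstructured magnitude pruning: zero out the smallest-magnitude
        weights until roughly `sparsity` fraction of entries are zero."""
        k = int(sparsity * w.size)
        if k == 0:
            return w.copy()
        # k-th smallest absolute value serves as the pruning threshold
        threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
        return w * (np.abs(w) > threshold)

    def quantize_symmetric_int8(w: np.ndarray):
        """Symmetric uniform quantization: map [-max|w|, +max|w|] onto
        the int8 range [-127, 127]; return quantized weights and scale."""
        scale = max(np.max(np.abs(w)), 1e-8) / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    # Toy usage: compress one random "layer" with both techniques.
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)

    w_sparse = prune_by_magnitude(w, sparsity=0.5)    # ~50% zeros
    q, scale = quantize_symmetric_int8(w_sparse)      # int8 storage
    w_restored = q.astype(np.float32) * scale         # dequantize

    err = np.max(np.abs(w_sparse - w_restored))
    print(f"sparsity: {(w_sparse == 0).mean():.2f}, max dequant error: {err:.4f}")

In this scheme the int8 tensor plus a single float scale per layer replaces the float32 weights (a roughly 4x storage reduction), and the induced zeros can additionally be skipped by sparsity-aware accelerators; the surveyed works combine variants of these ideas with hardware-aware search and scheduling.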
