APNAS: Accuracy-and-Performance-Aware Neural Architecture Search for Neural Hardware Accelerators
IEEE Access ( IF 3.9 ) Pub Date : 2020-01-01 , DOI: 10.1109/access.2020.3022327
Paniti Achararit , Muhammad Abdullah Hanif , Rachmad Vidya Wicaksana Putra , Muhammad Shafique , Yuko Hara-Azumi

Designing resource-efficient deep neural networks (DNNs) is a challenging task due to the enormous diversity of applications as well as their time-consuming design, training, optimization, and evaluation cycles, especially for resource-constrained embedded systems. To address these challenges, we propose a novel DNN design framework called accuracy-and-performance-aware neural architecture search (APNAS), which can generate DNNs efficiently, as it does not require hardware devices or simulators while searching for optimized DNN model configurations that offer both high inference accuracy and high execution performance. In addition, to accelerate the process of DNN generation, APNAS is built on a weight-sharing and reinforcement-learning-based exploration methodology, with a recurrent neural network controller at its core to generate sample DNN configurations. The reward in reinforcement learning is formulated as a configurable function that considers both a sample DNN's accuracy and the cycle count it requires on a target hardware architecture. To further expedite the DNN generation process, we devise analytical models for cycle count estimation instead of running millions of DNN configurations on real hardware. We demonstrate that these analytical models are highly accurate and provide cycle count estimates identical to those of a cycle-accurate hardware simulator. Experiments that quantitatively vary hardware constraints demonstrate that APNAS requires only 0.55 graphics processing unit (GPU) days on a single Nvidia GTX 1080Ti to generate DNNs that offer an average of 53% fewer cycles with negligible accuracy degradation (3% on average) for image classification compared to state-of-the-art techniques.
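To make the search objective concrete, the following is a minimal Python sketch of the two ingredients described above: an analytical cycle-count model for a convolutional layer and a configurable reward that trades accuracy against cycle count. The 16x16 processing-element (PE) array, the linear reward weighting, and all names here are illustrative assumptions, not the paper's exact formulation.

import math
import random

def estimate_conv_cycles(h, w, c_in, c_out, k, pe_rows=16, pe_cols=16):
    # Total multiply-accumulates (MACs) for a k x k convolution over an
    # h x w x c_in input producing c_out channels (stride 1, same padding).
    macs = h * w * c_in * c_out * k * k
    # Ideal throughput: one MAC per PE per cycle on a pe_rows x pe_cols array.
    return math.ceil(macs / (pe_rows * pe_cols))

def reward(accuracy, cycles, max_cycles, alpha=0.5):
    # Configurable trade-off: alpha = 1.0 recovers accuracy-only NAS,
    # alpha = 0.0 optimizes cycle count alone.
    perf = 1.0 - min(cycles / max_cycles, 1.0)
    return alpha * accuracy + (1.0 - alpha) * perf

# Toy exploration loop: score randomly sampled layer widths. In APNAS the
# sampler is an RNN controller trained with reinforcement learning and
# accuracy comes from a weight-sharing supernet; both are stubbed out here.
best = None
for _ in range(100):
    c_out = random.choice([32, 64, 128, 256])      # sampled design choice
    cycles = estimate_conv_cycles(32, 32, 64, c_out, 3)
    acc = 0.85 + 0.0004 * c_out                    # stand-in for measured accuracy
    r = reward(acc, cycles, max_cycles=2_000_000)
    if best is None or r > best[0]:
        best = (r, c_out, cycles)
print(best)

Raising alpha toward 1.0 biases the search toward accuracy, while lowering it favors configurations that need fewer cycles on the modeled accelerator.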

Updated: 2020-01-01