Optimized co-scheduling of mixed-precision neural network accelerator for real-time multitasking applications
Journal of Systems Architecture (IF 4.5), Pub Date: 2020-04-07, DOI: 10.1016/j.sysarc.2020.101775
Wei Jiang, Ziwei Song, Jinyu Zhan, Zhiyuan He, Xiangyu Wen, Ke Jiang

Neural networks are increasingly applied in real-time and embedded Artificial Intelligence (AI) systems such as autonomous driving systems. Such resource-constrained systems cannot support the execution of neural-network-based tasks because of their high execution overheads on general-purpose processors. Hence, we design real-time AI applications on embedded systems with CPU and FPGA (Field Programmable Gate Array) coprocessors. We use a dedicated FPGA to accelerate the neural network jobs and the CPU to process the remaining jobs of real-time multitasking applications. We devise an Idle-Aware Earliest Deadline First policy to co-schedule the AI applications on the hybrid CPU and FPGA coprocessors. Because implementing a neural network job on the FPGA accelerator with different precision configurations results in different execution times and accuracy, we also address the design optimization of real-time AI applications running on a mixed-precision neural network accelerator, with the goal of maximizing the accuracy-related rewards of all applications subject to real-time constraints. We formulate the problem as a multi-stage decision procedure and propose an efficient dynamic programming approach with two pruning policies that reduce the number of intermediate search states. Extensive experiments and real-life case evaluations demonstrate the efficiency of the proposed approaches.
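
The multi-stage decision formulation can be illustrated with a small sketch. The Python below is not the authors' algorithm: it assumes a simplified model in which each task offers several precision options, each with an integer execution time and an integer accuracy reward, and the only real-time constraint is a single total time budget. The function name `select_precisions`, the data layout, and both pruning rules (dropping states that exceed the budget, and dropping states dominated by another state that uses no more time and earns no less reward) are illustrative assumptions; the abstract does not say what the paper's two pruning policies are.

```python
from typing import List, Optional, Tuple

# One precision option for a task: (execution time, accuracy reward).
# Both are integers in this toy model (an assumption for illustration).
Option = Tuple[int, int]


def select_precisions(
    tasks: List[List[Option]], budget: int
) -> Optional[Tuple[int, List[int]]]:
    """Pick one option per task to maximize total reward within a time budget.

    Returns (best_reward, chosen_option_index_per_task), or None when no
    combination of options fits the budget.
    """
    # A state is (time_used, reward, option indices chosen so far).
    states = [(0, 0, [])]

    # One decision stage per task.
    for options in tasks:
        expanded = []
        for used, reward, picks in states:
            for idx, (cost, gain) in enumerate(options):
                total = used + cost
                # Assumed pruning rule 1: discard states that break the budget.
                if total > budget:
                    continue
                expanded.append((total, reward + gain, picks + [idx]))

        # Assumed pruning rule 2: keep only non-dominated states. After
        # sorting by time (ties broken by higher reward first), a state
        # survives only if it beats every reward seen at lower or equal time.
        expanded.sort(key=lambda s: (s[0], -s[1]))
        states = []
        best_reward = float("-inf")
        for state in expanded:
            if state[1] > best_reward:
                states.append(state)
                best_reward = state[1]

    if not states:
        return None
    winner = max(states, key=lambda s: s[1])
    return winner[1], winner[2]
```

For example, `select_precisions([[(2, 70), (4, 90)], [(1, 60), (3, 95)]], 6)` returns `(165, [0, 1])`: within a budget of 6, the first task takes its cheaper option and the second its more expensive one. A real co-scheduling setting would replace the single budget with a per-task schedulability check.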




Updated: 2020-04-07