Increased Leverage of Transprecision Computing for Machine Vision Applications at the Edge
Journal of Signal Processing Systems (IF 1.6), Pub Date: 2022-06-30, DOI: 10.1007/s11265-022-01784-1
Umar Ibrahim Minhas, JunKyu Lee, Lev Mukhanov, Georgios Karakonstantis, Hans Vandierendonck, Roger Woods

The practical deployment of machine vision presents particular challenges for resource-constrained edge devices. Because such devices must execute multiple tasks with variable workloads, a robust approach is needed that adapts dynamically at runtime and maintains the maximum quality of service (QoS) within the available resource constraints. We present a lightweight approach that monitors runtime workload constraints and leverages accuracy-throughput trade-offs on a graphics processing unit (GPU). It includes optimisation techniques that identify the optimal configuration for each task in terms of accuracy, energy and memory, and that manage transparent switching between configurations. Using a neural network architecture search that statically generates a range of implementations targeting a resource-precision trade-off, we explore the detection of the optimal parameters for the required QoS under specific memory and energy constraints. For an accuracy loss of 1%, we demonstrate that a \(1.6\times\) higher frame processing rate can be achieved on the GPU, with further improvements possible when the accuracy requirement is relaxed further. To further improve the switching between configurations, we enhance the proposed mechanism by offloading some of the frames to central processing units (CPUs), which improves the frame rate by a further 0.9%.
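
The configuration selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Config` fields, the `select_config` function, the `CONFIGS` table and all numeric values are hypothetical placeholders, not measurements from the paper. The sketch only assumes that each statically generated implementation has a known accuracy, frame rate, memory footprint and energy cost, and that the runtime picks the most accurate one that satisfies the current constraints.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Config:
    """One statically generated implementation (all fields are hypothetical)."""
    name: str
    accuracy: float    # accuracy in percent
    fps: float         # frames per second on the target device
    memory_mb: float   # peak memory footprint in MB
    energy_mj: float   # energy per frame in mJ


def select_config(
    configs: Sequence[Config],
    required_fps: float,
    memory_budget_mb: float,
    energy_budget_mj: float,
) -> Optional[Config]:
    """Return the most accurate configuration that meets all constraints.

    Returns None when no configuration is feasible.
    """
    feasible = [
        c for c in configs
        if c.fps >= required_fps
        and c.memory_mb <= memory_budget_mb
        and c.energy_mj <= energy_budget_mj
    ]
    if not feasible:
        return None
    # Prefer higher accuracy; break ties with the higher frame rate.
    return max(feasible, key=lambda c: (c.accuracy, c.fps))


# Placeholder profile table; the values are invented for illustration only.
CONFIGS = [
    Config("high_precision", accuracy=76.0, fps=30.0, memory_mb=900.0, energy_mj=80.0),
    Config("reduced_precision", accuracy=75.0, fps=48.0, memory_mb=500.0, energy_mj=50.0),
    Config("low_precision", accuracy=72.0, fps=70.0, memory_mb=300.0, energy_mj=30.0),
]

if __name__ == "__main__":
    # As the required frame rate rises, the selection moves to lower precision.
    for fps in (25.0, 40.0, 60.0):
        chosen = select_config(CONFIGS, fps, memory_budget_mb=1024.0, energy_budget_mj=100.0)
        print(fps, chosen.name if chosen else None)
```

In this sketch, a runtime monitor would call `select_config` again whenever the workload or the resource budgets change, and switch implementations only if the result differs from the configuration currently in use.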




Updated: 2022-07-01