Floating Point CGRA based Ultra-Low Power DSP Accelerator
Journal of Signal Processing Systems (IF 1.6), Pub Date: 2021-01-22, DOI: 10.1007/s11265-020-01630-2
Rohit Prasad, Satyajit Das, Kevin J. M. Martin, Philippe Coussy

Coarse-Grained Reconfigurable Arrays (CGRAs) are emerging in both academia and industry as energy-efficient accelerators that offer a high degree of flexibility. However, given recent advances in algorithms and the performance requirements of applications, supporting only integer and logical operations limits the appeal of traditional CGRAs. In this paper, we propose a novel CGRA architecture and an associated compilation flow that support both integer and floating-point computations for energy-efficient acceleration of DSP applications. Experimental results show that, at an area overhead of 1.9×, the proposed accelerator achieves a maximum speedup of 4.61× over a DSP-optimized, ultra-low-power RISC-V based CPU when executing seizure detection, which is representative of a wide range of EEG signal processing applications. The proposed CGRA is up to 6.5× more energy efficient than the single-core CPU. Compared with the multi-core CPU with 8 cores, the proposed CGRA achieves an energy gain of up to 4.4×.




Updated: 2021-01-22