Multi-Target Adaptive Reconfigurable Acceleration for Low-Power IoT Processing
IEEE Transactions on Computers (IF 3.6), Pub Date: 2020-04-02, DOI: 10.1109/tc.2020.2984736
Marcelo Brandalero 1 , Luigi Carro 2 , Antonio Carlos Schneider Beck 2 , Muhammad Shafique 3

Low-power processors for the Internet-of-Things (IoT) demand a high degree of adaptability to efficiently execute applications with different resource requirements under varying scenarios. Current single-ISA heterogeneous Chip Multiprocessors (CMPs), such as ARM's big.LITTLE, provide multiple cores and voltage/frequency levels to address this challenge. However, finding the best core type and corresponding voltage/frequency level for every execution scenario, across different applications and phases, remains an unsolved problem. In this article, we propose extending such a single-ISA heterogeneous CMP with a Coarse-Grained Reconfigurable Array (CGRA) and a hardware-based dynamic binary translation (DBT) module that transparently maps application code onto the CGRA for acceleration. To achieve low energy levels and efficiently manage the power consumption of the CGRA, we introduce an additional voltage rail that enables operation in the Near-Threshold Voltage (NTV) regime when needed, leveraging key features of the CGRA's structure to address the implementation challenges of NTV computing. For less than 35 percent area overhead over the baseline CMP, performance and energy consumption improve as follows: (a) compared to power-efficient execution on the LITTLE core, MuTARe achieves a 29 percent reduction in energy consumption and a 2× speedup; (b) compared to performance-efficient execution on the big core, it achieves a 1.6× speedup with a 41 percent reduction in energy consumption.
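To put the two reported comparisons on a common scale, one option is to combine each energy reduction and speedup into an energy-delay product (EDP) relative to its baseline. The sketch below is only illustrative arithmetic on the figures quoted in the abstract; the EDP metric and the helper name `relative_edp` are assumptions introduced here, not results or terminology taken from the paper.

```python
def relative_edp(energy_reduction: float, speedup: float) -> float:
    """Energy-delay product relative to a baseline normalized to 1.0.

    Hypothetical helper for illustration; the abstract reports energy
    reduction and speedup separately and does not state an EDP.
    """
    relative_energy = 1.0 - energy_reduction  # 29 percent reduction -> 0.71
    relative_delay = 1.0 / speedup            # 2x speedup -> 0.5
    return relative_energy * relative_delay


# Figures quoted in the abstract, each against a different baseline.
vs_little = relative_edp(energy_reduction=0.29, speedup=2.0)
vs_big = relative_edp(energy_reduction=0.41, speedup=1.6)

print(f"EDP relative to LITTLE core: {vs_little:.3f}")
print(f"EDP relative to big core:    {vs_big:.3f}")
```

Under this assumed metric, both comparisons come out at a little over one third of the respective baseline's EDP. The two values are not directly comparable to each other, because each is normalized to a different core.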

Updated: 2020-04-02