All-digital control-theoretic scheme to optimize energy budget and allocation in multi-cores
IEEE Transactions on Computers (IF 3.6) Pub Date: 2020-05-01, DOI: 10.1109/tc.2019.2963859
Davide Zoni, Luca Cremona, William Fornaciari

The Internet-of-Things (IoT) revolution has fueled new challenges and opportunities to achieve computational efficiency goals. Embedded devices are required to execute multiple applications, among which the computing power must be suitably distributed and adapted at run-time. Such complex hardware platforms have to sustain the continuous acquisition and processing of data under severe energy budget constraints, since most of them are battery powered. The state of the art offers several ad-hoc contributions to selectively optimize performance considering aspects like energy, power, thermal behavior, or reliability. However, there is a need for a generic coordinated management strategy able to cope with all of these dimensions, while allowing the Operating System (OS) and the applications to “suggest” or constrain the actuation. This article proposes a unified control-theoretic scheme to coordinate the design of energy-budget and energy-allocation solutions for multi-cores. The proposed controller can work with any actuator and can interact, at run-time, with both the applications and the OS to optimize the actuation signals steering the computing platform. The control scheme makes it possible to integrate any performance-related policy in the form of an energy-allocation strategy, while still ensuring the theoretical exponential stability of the overall controller if the actuation of the policy, coming from the OS and the applications, “is not too fast.” To demonstrate the feasibility of our solution, we have implemented the controller in a RISC multi-core running on the Xilinx Artix 100t FPGA device, available on the Digilent Nexys4-DDR board. Results considering two actuators and both the quad-core and the eight-core versions of the computing platform highlight the scalability of the proposed solution, as well as an area overhead for the all-digital, on-chip controller limited to 0.86 percent (FFs) and 5.3 percent (LUTs) of the FPGA chip.
We also considered a dynamic scenario to validate the speed of the controller, in which our framework has to cope with modifications to the energy-allocation control policy carried out by the OS and the applications. The results are collected by executing a large mix of benchmarks, and statistical significance is accounted for by executing each scenario 30 times. The results are analyzed considering three quality metrics. First, the efficiency in exploiting the imposed budget ($EFF_g$) is 98.27 percent on average. Second, the overflow of the actual average power consumption with respect to the assigned budget ($OVF_g$) is limited to 1.43 mW on average. Last, the performance utility loss due to the control scheme is limited to 1.87 percent on average.

Updated: 2020-05-01