Fast DSE of reconfigurable accelerator systems via ensemble machine learning
Analog Integrated Circuits and Signal Processing (IF 1.2), Pub Date: 2021-05-28, DOI: 10.1007/s10470-021-01885-0
Alba Lopes, Monica Pereira

Reconfigurable hardware accelerators (RAs) attached to processors have become a frequent choice for meeting the performance demands of current embedded applications. However, determining when a combination of general-purpose processors (GPPs) and RAs can deliver the expected performance at the additional area and energy cost demands an extensive design space exploration (DSE). Performing DSE through hardware synthesis is extremely time-consuming and costly. High-level simulation is a faster and simpler DSE method, at the cost of accuracy. Even so, high-level simulation cannot cover all design solutions while still meeting time-to-market. In this scenario, machine learning (ML) has become a promising way to make the DSE of large hardware designs more robust by predicting aspects such as performance and energy. A main challenge in designing a high-accuracy predictor is selecting a single ML algorithm that covers a wide range of applications. In this context, ensemble learning is promising because it uses multiple models and combines their predictions. In this work we employ ensemble methods to simplify and speed up the DSE of GPPs with RAs. We evaluate three ensemble methods: Random Forest, AdaBoost, and Gradient Boosting. We compare them to the regression algorithms most used in the literature for DSE of computer architectures. Results show a prediction error rate below 2% for some benchmarks when using ensemble methods and a throughput of more than 6000 predictions per second when using Gradient Boosting.
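The idea of an ensemble that uses multiple models and combines their predictions can be illustrated with a minimal sketch. It is not the authors' setup: it uses bagging of one-split regression trees (decision stumps) as a simplified stand-in for Random Forest, written in plain Python. The design parameters (RA rows and columns), the cycle-count formula, and all function names are hypothetical and chosen only for illustration.

```python
import random
import time
from statistics import mean


def fit_stump(X, y):
    """Fit a one-split regression tree by exhaustive search on squared error."""
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            lm, rm = mean(left), mean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    if best is None:  # no feature varies: predict the mean everywhere
        m = mean(y)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    f, thr, left_mean, right_mean = stump
    return left_mean if row[f] <= thr else right_mean


def fit_bagged(X, y, n_models, rng):
    """Train each stump on a bootstrap resample of the data (bagging)."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagged(models, row):
    """Combine the ensemble members by averaging their predictions."""
    return mean(predict_stump(m, row) for m in models)


def make_toy_data(rng):
    # Hypothetical design points: (RA rows, RA columns) -> made-up cycle count.
    X = [(r, c) for r in range(1, 9) for c in range(1, 9)]
    y = [1000.0 / r + 20.0 * c + rng.uniform(-5, 5) for r, c in X]
    return X, y


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_toy_data(rng)
    models = fit_bagged(X, y, n_models=25, rng=rng)

    # Time the prediction loop to estimate throughput on this machine.
    start = time.perf_counter()
    preds = [predict_bagged(models, row) for row in X]
    elapsed = time.perf_counter() - start

    mae = mean(abs(p - t) for p, t in zip(preds, y))
    print(f"mean absolute error on the toy data: {mae:.1f}")
    print(f"throughput: {len(X) / elapsed:.0f} predictions/s")
```

The throughput printed here depends entirely on the machine and on this toy model, so it is not comparable to the figure reported in the abstract. In practice, a library such as scikit-learn would likely replace the hand-written stumps with full Random Forest, AdaBoost, or Gradient Boosting regressors.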




Updated: 2021-05-28