Efficient traversal of decision tree ensembles with FPGAs
Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2021-05-10, DOI: 10.1016/j.jpdc.2021.04.008
Romina Molina, Fernando Loor, Veronica Gil-Costa, Franco Maria Nardini, Raffaele Perego, Salvatore Trani

System-on-Chip (SoC) based Field Programmable Gate Arrays (FPGAs) provide a hardware acceleration technology that can be rapidly deployed and tuned, offering a flexible solution that adapts to specific design requirements and to changing demands. In this paper, we present three SoC architecture designs for speeding up inference tasks based on machine-learned ensembles of decision trees. We focus on QuickScorer, the state-of-the-art algorithm for the efficient traversal of tree ensembles, and present the issues and advantages related to its deployment on two SoC devices with different capacities. Experiments conducted on publicly available datasets show that the proposed solution is very efficient and scalable. More importantly, it provides almost constant inference times, independent of the number of trees in the model and the number of instances to score. This allows the deployed SoC solution to be fine-tuned according to the accuracy and latency constraints of the application scenario considered.




Updated: 2021-05-14