Koios: A Deep Learning Benchmark Suite for FPGA Architecture and CAD Research

arXiv - CS - Hardware Architecture. Pub Date: 2021-06-13. DOI: arxiv-2106.07087
Aman Arora, Andrew Boutros, Daniel Rauch, Aishwarya Rajen, Aatman Borda, Seyed Alireza Damghani, Samidh Mehta, Sangram Kate, Pragnesh Patel, Kenneth B. Kent, Vaughn Betz, Lizy K. John

With the prevalence of deep learning (DL) in many applications, researchers are investigating ways to optimize FPGA architecture and CAD to achieve better quality-of-results (QoR) on DL-based workloads. Benchmark circuits are an essential component of this optimization process: the QoR achieved on a set of benchmarks is the main driver for architecture and CAD design choices. However, current academic benchmark suites are inadequate, as they include no designs from the DL domain. This work presents Koios, a new suite of DL acceleration benchmark circuits for FPGA architecture and CAD research. The suite's 19 circuits cover a wide variety of accelerated neural networks, design sizes, implementation styles, abstraction levels, and numerical precisions. These designs are larger, more data parallel, more heterogeneous, and more deeply pipelined than existing open-source benchmarks, and they use more FPGA architectural features. This enables researchers to pinpoint architectural inefficiencies for this class of workloads and to optimize CAD tools on more realistic benchmarks that stress the CAD algorithms in different ways. In this paper, we describe the designs in our benchmark suite, present results of running them through the Verilog-to-Routing (VTR) flow using a recent FPGA architecture model, and identify key insights from the resulting metrics. On average, compared to the widely used VTR suite, our benchmarks have 3.7x more netlist primitives, 1.8x higher DSP density, 4.7x higher BRAM density, and 1.7x higher frequency with 1.9x more near-critical paths. Finally, we present two example case studies showing how architectural exploration for DL-optimized FPGAs can be performed using our new benchmark suite.

Updated: 2021-06-15
