On Predictable Reconfigurable System Design
ACM Transactions on Architecture and Code Optimization (IF 1.6), Pub Date: 2021-02-10, DOI: 10.1145/3436995
Nils Voss 1, Bastiaan Kwaadgras 2, Oskar Mencer 2, Wayne Luk 3, Georgi Gaydadjiev 4

We propose a design methodology to facilitate rigorous development of complex applications targeting reconfigurable hardware. Our methodology relies on analytical estimation of system performance and area utilisation for a specific application and a particular system instance consisting of a control-flow machine working in conjunction with one or more reconfigurable dataflow accelerators. The targeted application is carefully analyzed, and the parts identified for hardware acceleration are reimplemented as a set of representative software models. Next, using the results of the application analysis, a suitable system architecture is devised and its performance is evaluated to determine bottlenecks, allowing predictable design. The architecture is iteratively refined until a final version is obtained that satisfies the specification requirements for performance and hardware area. We validate the presented methodology using a widely accepted convolutional neural network (VGG-16) and an important HPC application (BQCD). In both cases, our methodology alleviated all system bottlenecks before the hardware implementation was started. As a result, the architectures were implemented right the first time, achieving state-of-the-art performance within 15% of our modelling estimations.
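The estimate-and-refine loop described in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes a simple bound in which runtime is the largest of the compute, memory, and host-link times, and it refines the design by doubling a hypothetical `lanes` parameter while the design is compute-bound and an assumed area budget allows. All names (`Design`, `Workload`, `estimate`, `refine`) and all numbers are made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Design:
    """Hypothetical accelerator design point (illustrative fields only)."""
    lanes: int               # parallel compute lanes, one op per lane per cycle
    clock_hz: float          # accelerator clock frequency
    mem_bw_bytes_s: float    # off-chip memory bandwidth
    link_bw_bytes_s: float   # host <-> accelerator link bandwidth
    area_per_lane: float     # fraction of the device consumed by one lane


@dataclass(frozen=True)
class Workload:
    """Hypothetical totals that an application analysis might produce."""
    ops: float               # total operations to execute
    mem_bytes: float         # bytes moved to and from off-chip memory
    link_bytes: float        # bytes moved over the host link


def estimate(design, workload):
    """Return (runtime in seconds, name of the limiting resource)."""
    times = {
        "compute": workload.ops / (design.lanes * design.clock_hz),
        "memory": workload.mem_bytes / design.mem_bw_bytes_s,
        "link": workload.link_bytes / design.link_bw_bytes_s,
    }
    bottleneck = max(times, key=times.get)
    return times[bottleneck], bottleneck


def refine(design, workload, target_s, area_budget=1.0):
    """Double the lane count while compute-bound, too slow, and within area."""
    while True:
        runtime, bottleneck = estimate(design, workload)
        if runtime <= target_s or bottleneck != "compute":
            return design, runtime, bottleneck
        next_lanes = design.lanes * 2
        if next_lanes * design.area_per_lane > area_budget:
            return design, runtime, bottleneck  # area budget exhausted
        design = replace(design, lanes=next_lanes)


if __name__ == "__main__":
    start = Design(lanes=1, clock_hz=1e8, mem_bw_bytes_s=1e9,
                   link_bw_bytes_s=1e9, area_per_lane=0.1)
    work = Workload(ops=8e8, mem_bytes=1e9, link_bytes=1e8)
    final, runtime, limit = refine(start, work, target_s=2.5)
    print(final.lanes, runtime, limit)
```

If the loop stops with a bottleneck other than compute, adding lanes would not help, and the architecture itself (for example its memory system) would need to change. Catching that kind of problem before any hardware is implemented is what the abstract credits the methodology with.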

Updated: 2021-02-10