Transparent FPGA Acceleration with TensorFlow
arXiv - CS - Hardware Architecture Pub Date : 2021-02-02 , DOI: arxiv-2102.06018
Simon Pfenning, Philipp Holzinger, Marc Reichenbach

Today, artificial neural networks are among the major innovations driving progress in machine learning. This has particularly affected the development of neural network accelerator hardware. However, since most of these architectures require specialized toolchains, developers face additional effort each time they want to make use of a new deep learning accelerator. Furthermore, the flexibility of the device is bound to the architecture itself, as well as to the functionality of the runtime environment. In this paper we propose a toolflow using TensorFlow as the frontend, thus offering developers the opportunity to work in a familiar environment. On the backend we use an FPGA, which is addressable via an HSA runtime environment. In this way we are able to hide the complexity of controlling new hardware from the user, while at the same time maintaining a high degree of flexibility. This is achieved by our HSA toolflow, since the hardware is not statically configured with the structure of the network. Instead, it can be dynamically reconfigured at runtime with the respective kernels executed by the network, and simultaneously serve kernels from other sources, e.g. OpenCL/OpenMP.
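The dynamic reconfiguration the abstract describes can be pictured as a runtime kernel registry: instead of baking the network structure into the bitstream, kernels for individual ops are registered and swapped at runtime, and the same device can serve kernels from several sources. The following is a minimal, hypothetical Python sketch of that idea; the class and backend names (`KernelRegistry`, `"hsa_fpga"`) are illustrative assumptions, not the authors' actual API.

```python
# Hypothetical sketch of runtime kernel dispatch. Names and backends
# ("hsa_fpga", "cpu") are illustrative assumptions, not the paper's API.
from typing import Callable, Dict, Tuple

class KernelRegistry:
    """Maps (op_name, backend) pairs to executable kernels.

    Mirrors the idea that the accelerator is not statically configured
    with the network structure: kernels can be (re)registered while the
    program runs, and the device may also serve kernels from other
    sources such as OpenCL or OpenMP.
    """

    def __init__(self) -> None:
        self._kernels: Dict[Tuple[str, str], Callable] = {}

    def register(self, op_name: str, backend: str, fn: Callable) -> None:
        # Re-registering overwrites the previous kernel: this models
        # dynamic reconfiguration at runtime.
        self._kernels[(op_name, backend)] = fn

    def dispatch(self, op_name: str, backend: str, *args):
        # Fall back to a CPU reference kernel when the accelerator has
        # no kernel loaded for this op.
        fn = (self._kernels.get((op_name, backend))
              or self._kernels.get((op_name, "cpu")))
        if fn is None:
            raise KeyError(f"no kernel registered for {op_name!r}")
        return fn(*args)

registry = KernelRegistry()

# CPU reference implementation of a matrix multiply.
registry.register(
    "matmul", "cpu",
    lambda a, b: [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                  for row in a],
)

# At runtime, an HSA-managed FPGA kernel could be swapped in; here the
# "accelerated" kernel just delegates to the CPU reference for the sketch.
registry.register(
    "matmul", "hsa_fpga",
    lambda a, b: registry.dispatch("matmul", "cpu", a, b),
)

print(registry.dispatch("matmul", "hsa_fpga", [[1, 2]], [[3], [4]]))  # [[11]]
```

The design choice the sketch highlights is the indirection layer: because the frontend (here, a TensorFlow graph) only names ops, the binding of ops to hardware kernels can change without touching user code.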

Updated: 2021-02-12