Caffe Barista: Brewing Caffe with FPGAs in the Training Loop
arXiv - CS - Distributed, Parallel, and Cluster Computing. Pub Date: 2020-06-18, DOI: arxiv-2006.13829
Diederik Adriaan Vink, Aditya Rajagopal, Stylianos I. Venieris, Christos-Savvas Bouganis

As the complexity of deep learning (DL) models increases, their compute requirements increase accordingly. Deploying a Convolutional Neural Network (CNN) involves two phases: training and inference. With the inference task typically taking place on resource-constrained devices, a lot of research has explored the field of low-power inference on custom hardware accelerators. On the other hand, training is both more compute- and memory-intensive and is primarily performed on power-hungry GPUs in large-scale data centres. CNN training on FPGAs is a nascent field of research. This is primarily due to the lack of tools to easily prototype and deploy various hardware and/or algorithmic techniques for power-efficient CNN training. This work presents Barista, an automated toolflow that provides seamless integration of FPGAs into the training of CNNs within the popular deep learning framework Caffe. To the best of our knowledge, this is the only tool that allows for such versatile and rapid deployment of hardware and algorithms for the FPGA-based training of CNNs, providing the necessary infrastructure for further research and development.

Updated: 2020-06-25