High-Performance Mixed-Low-Precision CNN Inference Accelerator on FPGA
IEEE Micro (IF 2.8), Pub Date: 2021-05-19, DOI: 10.1109/mm.2021.3081735
Junbin Wang, Shaoxia Fang, Xi Wang, Jiangsha Ma, Taobo Wang, Yi Shan

Low-precision techniques can effectively reduce the computational complexity and bandwidth requirements of convolutional neural network (CNN) inference, but may lead to significant accuracy degradation. Mixed-low-precision techniques provide a superior approach for CNN inference, since they retain the advantages of low precision while maintaining accuracy. In this article, we propose a high-performance, highly flexible $W^8A^8$ (INT8 weight and INT8 activation) and $W^TA^2$ (TERNARY weight and INT2 activation) mixed-precision CNN inference hardware architecture, DPUmxp, designed and implemented on a Xilinx Virtex UltraScale+ VU13P FPGA with peak performance up to 58.9 TOPS.
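The abstract names two numeric modes, $W^8A^8$ and $W^TA^2$. Purely as illustration of what those modes mean arithmetically, the NumPy sketch below quantizes weights and activations in both modes and computes an integer dot product that is dequantized by the product of the scales. This is not the authors' FPGA datapath; the function names, the TWN-style 0.7·mean(|w|) ternary threshold, and the unsigned post-ReLU INT2 convention are all assumptions made for the example.

```python
import numpy as np

def quantize_int8(x):
    # Symmetric per-tensor INT8: x ~= scale * q, with q in [-127, 127].
    scale = max(float(np.abs(x).max()), 1e-8) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def quantize_ternary(w):
    # Ternary weights, q in {-1, 0, +1}; the 0.7 * mean(|w|) threshold is a
    # common TWN-style heuristic, assumed here, not taken from the paper.
    t = 0.7 * float(np.abs(w).mean())
    q = np.where(w > t, 1, np.where(w < -t, -1, 0)).astype(np.int8)
    nz = q != 0
    scale = float(np.abs(w[nz]).mean()) if nz.any() else 1.0
    return q, scale

def quantize_uint2(a):
    # Unsigned 2-bit activations (assumes post-ReLU inputs): q in {0, 1, 2, 3}.
    a = np.maximum(a, 0.0)
    scale = max(float(a.max()), 1e-8) / 3.0
    q = np.clip(np.round(a / scale), 0, 3).astype(np.int8)
    return q, scale

def int_dot(qa, sa, qw, sw):
    # Integer multiply-accumulate in INT32, then one dequantization per output.
    acc = int(np.dot(qa.astype(np.int32), qw.astype(np.int32)))
    return acc * sa * sw

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)
a = np.maximum(rng.normal(size=256), 0.0).astype(np.float32)  # ReLU-like input

qw8, sw8 = quantize_int8(w)
qa8, sa8 = quantize_int8(a)
qwt, swt = quantize_ternary(w)
qa2, sa2 = quantize_uint2(a)

print("fp32 reference:", float(np.dot(a, w)))
print("W8A8 mode     :", int_dot(qa8, sa8, qw8, sw8))
print("WTA2 mode     :", int_dot(qa2, sa2, qwt, swt))
```

The sketch makes the trade-off concrete: the $W^TA^2$ path is less accurate than $W^8A^8$ but its multiplies collapse to sign selection and tiny shifts, which is what makes a mixed scheme attractive on FPGA fabric, as the abstract argues.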
