An FPGA-Based Convolutional Neural Network Coprocessor
Wireless Communications and Mobile Computing (IF 2.146). Pub Date: 2021-06-14. DOI: 10.1155/2021/3768724
Changpei Qiu, Xin'an Wang, Tianxia Zhao, Qiuping Li, Bo Wang, Hu Wang

In this paper, an FPGA-based convolutional neural network coprocessor is proposed. The coprocessor contains a 1D convolutional computation unit (PE) operating in row-stationary (RS) streaming mode and a 3D convolutional computation unit (a PE chain) organized as a systolic array. The coprocessor can flexibly control how many PE arrays are enabled according to the number of output channels of the convolutional layer. We design a storage system with a multilevel cache, in which the global cache distributes data to the local caches through multiple broadcasts, and we propose an image segmentation method that matches the hardware architecture. The proposed coprocessor implements the convolutional and pooling layers of the VGG16 neural network model, with activation, weight, and bias values quantized to 16-bit fixed point. At a clock frequency of 200 MHz, it achieves a peak computational performance of 316.0 GOP/s and an average computational performance of 62.54 GOP/s with a power consumption of about 9.25 W.
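
As a rough illustration of the 16-bit fixed-point quantization applied to the activation, weight, and bias values, the following minimal Python sketch shows one possible scheme. The Q8.8 split (8 fraction bits), the function names, and the saturating clip are assumptions made for illustration only; the abstract does not specify the exact format used on the coprocessor.

import numpy as np

# Assumed Q8.8 fixed-point format: 1 sign bit + 7 integer bits + 8 fraction bits.
FRAC_BITS = 8
Q_MIN, Q_MAX = -2**15, 2**15 - 1   # 16-bit signed range

def quantize_q16(x):
    """Map float activations/weights/biases to 16-bit fixed point (hypothetical Q8.8)."""
    q = np.round(np.asarray(x, dtype=np.float64) * (1 << FRAC_BITS))
    return np.clip(q, Q_MIN, Q_MAX).astype(np.int16)

def dequantize_q16(q):
    """Recover an approximate float value from the 16-bit fixed-point code."""
    return q.astype(np.float64) / (1 << FRAC_BITS)

def mac_q16(acc32, a_q, w_q):
    """One fixed-point multiply-accumulate step, as a PE might perform it:
    the 32-bit product carries 2*FRAC_BITS fraction bits and is shifted back down."""
    return acc32 + (np.int32(a_q) * np.int32(w_q) >> FRAC_BITS)

# Example: quantize a weight and an activation, then accumulate once.
w = quantize_q16(0.37)                 # -> 95, i.e. 0.37109375 after dequantization
a = quantize_q16(1.5)                  # -> 384
acc = mac_q16(np.int32(0), a, w)
print(dequantize_q16(np.int16(acc)))   # ~0.5547, close to the float result 0.555

The multiply-accumulate helper mirrors what a fixed-point PE does in hardware: the wide intermediate product is rescaled before accumulation so that the running sum stays in the same fixed-point format as the activations.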

Updated: 2021-06-14