

A Unified Hardware Architecture for Convolutions and Deconvolutions in CNN
arXiv - CS - Hardware Architecture. Pub Date: 2020-05-29, DOI: arxiv-2006.00053
Lin Bai, Yecheng Lyu and Xinming Huang

In this paper, a scalable neural network hardware architecture for image segmentation is proposed. By sharing the same computing resources, both convolution and deconvolution operations are handled by the same processing element array. In addition, access to on-chip and off-chip memories is optimized to alleviate the burden introduced by partial sums. As an example, SegNet-Basic has been implemented using the proposed unified architecture on a Xilinx ZC706 FPGA, achieving 151.5 GOPS for convolution and 94.3 GOPS for deconvolution. This unified convolution/deconvolution design is applicable to other CNNs with deconvolution.
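The abstract does not say how the two operations are mapped onto one processing element array. As a rough illustration only, one common way to run a deconvolution (transposed convolution) on convolution hardware is to insert zeros between input pixels, pad the result, and convolve with the kernel rotated by 180 degrees. The sketch below assumes that approach, a single channel, and a square kernel; the function names are hypothetical and are not taken from the paper.

```python
def conv2d_valid(x, k):
    """'Valid' 2-D multiply-accumulate loop; stands in for the shared compute array."""
    H, W, K = len(x), len(x[0]), len(k)
    out = [[0] * (W - K + 1) for _ in range(H - K + 1)]
    for i in range(H - K + 1):
        for j in range(W - K + 1):
            acc = 0
            for a in range(K):
                for b in range(K):
                    acc += x[i + a][j + b] * k[a][b]
            out[i][j] = acc
    return out


def deconv2d_via_conv(x, k, stride):
    """Transposed convolution rewritten as zero insertion + padding + conv2d_valid."""
    H, W, K = len(x), len(x[0]), len(k)
    pad = K - 1
    # Size after inserting (stride - 1) zeros between neighbouring pixels.
    Hz, Wz = (H - 1) * stride + 1, (W - 1) * stride + 1
    z = [[0] * (Wz + 2 * pad) for _ in range(Hz + 2 * pad)]
    for i in range(H):
        for j in range(W):
            z[pad + i * stride][pad + j * stride] = x[i][j]
    # Rotate the kernel by 180 degrees so the sliding window matches the scatter form.
    k_flipped = [row[::-1] for row in k[::-1]]
    return conv2d_valid(z, k_flipped)


def deconv2d_reference(x, k, stride):
    """Direct scatter-add definition of transposed convolution, used as a check."""
    H, W, K = len(x), len(x[0]), len(k)
    out = [[0] * ((W - 1) * stride + K) for _ in range((H - 1) * stride + K)]
    for i in range(H):
        for j in range(W):
            for a in range(K):
                for b in range(K):
                    out[i * stride + a][j * stride + b] += x[i][j] * k[a][b]
    return out


if __name__ == "__main__":
    x = [[1, 2], [3, 4]]
    k = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
    assert deconv2d_via_conv(x, k, 2) == deconv2d_reference(x, k, 2)
```

In this formulation many of the multiply-accumulates in the deconvolution path operate on inserted zeros. That may be one reason a shared array reports lower deconvolution throughput than convolution throughput, though the abstract itself gives no explanation for the gap.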


Updated: 2020-06-02
