TileNET: Hardware accelerator for ternary Convolutional Neural Networks
Microprocessors and Microsystems (IF 1.9), Pub Date: 2021-01-27, DOI: 10.1016/j.micpro.2021.104039
Sagar Eetha, Sruthi P.K., Vibha Pant, Sai Vikram, Mihir Mody, Madhura Purnaprajna

Convolutional Neural Networks (CNNs) are widely used for camera perception in Advanced Driver Assistance Systems (ADAS). The algorithm's versatility makes it applicable to tasks such as object detection, lane detection and semantic segmentation. For image processing to be viable in driver assistance systems, the required throughput is on the order of a few tens of tera multiply-accumulate operations per second (TMACs). At the same time, high detection and recognition accuracy cannot be sacrificed to meet this throughput requirement.
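
To give a feel for where such throughput figures come from, the following is a minimal sketch that counts the multiply-accumulate operations (MACs) of a single convolution layer and scales them by a frame rate. The layer shape and the 30 frames per second are hypothetical values chosen for illustration; they are not taken from the paper, and the helper names are made up for this example.

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """MACs for one k x k convolution layer.

    Stride and padding are assumed to be already reflected in the
    output feature-map size (h_out x w_out).
    """
    return h_out * w_out * c_out * k * k * c_in


def required_tmacs(macs_per_frame, frames_per_second):
    """Sustained throughput, in tera-MACs per second, needed for a frame rate."""
    return macs_per_frame * frames_per_second / 1e12


# Hypothetical layer: 3x3 convolution, 64 -> 64 channels, 224x224 output map.
layer_macs = conv_macs(h_out=224, w_out=224, c_in=64, c_out=64, k=3)

# Hypothetical camera frame rate of 30 frames per second.
layer_tmacs = required_tmacs(layer_macs, frames_per_second=30)

print(layer_macs)   # MACs for this one layer, per frame
print(layer_tmacs)  # TMACs per second for this one layer
```

A full network sums this count over all layers, and a driver assistance system may process several camera streams at higher resolutions, which is how the total requirement grows well beyond a single layer's share.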

In this paper, we present TileNET, a novel tiled architecture for ternary-weighted CNNs. TileNET is modular and scales across variations in network organization and device configuration. Two implementation modes are presented: systolic and streaming. We also develop a high-level estimation technique that enables fast performance evaluation through design space exploration across a range of target devices and CNN models.
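
As background on why ternary weights suit hardware, the sketch below shows a dot product and a small 2D convolution in which every weight is restricted to -1, 0 or +1, so each multiplication reduces to an add, a subtract or a skip. This is a plain software illustration of ternary-weighted arithmetic in general; it is not the TileNET tile design, and the function names are hypothetical.

```python
def ternary_dot(activations, weights):
    """Dot product with weights in {-1, 0, +1}, using no multiplications."""
    acc = 0
    for a, w in zip(activations, weights):
        if w == 1:
            acc += a      # +1 weight: add the activation
        elif w == -1:
            acc -= a      # -1 weight: subtract the activation
        elif w != 0:
            raise ValueError("weight must be -1, 0 or +1")
        # 0 weight: the activation is skipped entirely
    return acc


def ternary_conv2d(image, kernel):
    """Stride-1, unpadded 2D convolution (cross-correlation form) with a ternary kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    flat_kernel = [w for row in kernel for w in row]
    out = []
    for i in range(out_h):
        out_row = []
        for j in range(out_w):
            # Gather the input window in the same row-major order as the kernel.
            window = [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
            out_row.append(ternary_dot(window, flat_kernel))
        out.append(out_row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = ternary_conv2d(image, kernel)
print(result)
```

In hardware, the same idea lets each processing element replace a multiplier with an adder/subtractor and a small select, which is one reason ternary networks map well to FPGA and standard cell logic.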

Area and throughput estimation has been verified for Xilinx Virtex, Artix, Kintex and Zynq devices. TileNET implemented on a Virtex-7 (XC7VX1140T) achieves a throughput of about 16 Tera-operations per second (TOPs) for LeNet, AlexNet, ResNet-50 and VGG-16. In addition, the 45 nm standard cell implementation of TileNET achieves a throughput of about 30 TOPs.




Updated: 2021-02-07