RoadNet-RT: High Throughput CNN Architecture and SoC Design for Real-Time Road Segmentation
IEEE Transactions on Circuits and Systems I: Regular Papers (IF 5.1), Pub Date: 2021-02-01, DOI: 10.1109/tcsi.2020.3038139
Lin Bai, Yecheng Lyu, Xinming Huang

In recent years, convolutional neural networks (CNNs) have gained popularity in many engineering applications, especially in computer vision. To achieve better performance, more complex structures and advanced operations have been incorporated into neural networks, which results in very long inference times. For time-critical tasks such as autonomous driving and virtual reality, real-time processing is fundamental. To reach real-time processing speed, this article proposes a lightweight, high-throughput CNN architecture named RoadNet-RT for road segmentation. It achieves a MaxF score of 92.55% on the KITTI road segmentation dataset. The inference time is about 9 ms per frame when running on a GTX 1080 GPU. Compared with the state-of-the-art network, RoadNet-RT speeds up inference by a factor of 17.8 at the cost of only a 3.75% loss in accuracy. In addition, its accuracy on the CamVid dataset is 92.98%. Several techniques, such as depthwise separable convolution and convolution with non-uniform kernel sizes, are optimized in the hardware accelerator design. The proposed CNN architecture has been successfully implemented on a ZCU102 MPSoC FPGA, which achieves a computation capability of 331 GOPS using INT8 quantization. The system throughput reaches 196.7 frames per second with an input image size of $280\times 960$. The source code is published at https://github.com/linbaiwpi/RoadNet-RT.
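
To put the figures quoted in the abstract side by side, the short Python sketch below converts them into common units. Only the constants come from the abstract; the derived values are back-of-the-envelope estimates, not results reported by the authors. Two assumptions are worth flagging: the inferred baseline latency assumes the 17.8x speed-up was measured on the same GPU, and the per-frame operation count is only an upper bound that assumes the 331 GOPS figure is fully sustained.

```python
# Figures quoted in the abstract.
GPU_LATENCY_MS = 9.0     # approximate inference time per frame on a GTX 1080
SPEEDUP = 17.8           # speed-up over the state-of-the-art network
FPGA_FPS = 196.7         # system throughput on the ZCU102
FPGA_GOPS = 331.0        # computation capability with INT8 quantization
INPUT_SIZE = (280, 960)  # input image size as quoted


def fps_from_latency(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# GPU frame rate implied by the ~9 ms latency.
gpu_fps = fps_from_latency(GPU_LATENCY_MS)

# Latency of the comparison network implied by the speed-up factor
# (assumption: both networks were timed on the same hardware).
baseline_latency_ms = GPU_LATENCY_MS * SPEEDUP

# Time budget per frame on the FPGA at the quoted throughput.
fpga_frame_time_ms = 1000.0 / FPGA_FPS

# Upper bound on operations per frame, assuming the quoted GOPS
# figure is sustained for the whole frame (an assumption, not a
# number from the paper).
gop_per_frame_bound = FPGA_GOPS / FPGA_FPS

# Pixels processed per second at the quoted input size.
pixels_per_second = FPGA_FPS * INPUT_SIZE[0] * INPUT_SIZE[1]

if __name__ == "__main__":
    print(f"GPU frame rate:           {gpu_fps:.1f} fps")
    print(f"Implied baseline latency: {baseline_latency_ms:.1f} ms")
    print(f"FPGA time per frame:      {fpga_frame_time_ms:.2f} ms")
    print(f"GOP per frame (bound):    {gop_per_frame_bound:.2f}")
    print(f"Pixels per second:        {pixels_per_second:,.0f}")
```

Both the GPU and the FPGA figures sit well above a typical 30 fps camera rate, which is consistent with the real-time claim in the abstract.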

Updated: 2021-02-01