A deep learning approach for quality enhancement of surveillance video
Journal of Intelligent Transportation Systems (IF 2.8), Pub Date: 2019-10-21, DOI: 10.1080/15472450.2019.1670659
Dandan Ding, Junchao Tong, Lingyi Kong

Abstract The growing number of surveillance cameras places great demands on high-efficiency video coding. Although modern video coding standards have significantly improved coding efficiency, they are designed for general video rather than surveillance video. The special characteristics of surveillance video leave considerable room for further performance improvement. In this paper, we leverage a deep learning approach to enhance the quality of compressed surveillance video. More specifically, we formulate frame enhancement as a regression problem and design a Residual Squeeze-and-Excitation Network (RSE-Net) to address it. RSE-Net extensively exploits the non-linear mapping between the reconstructed frame and the ground truth while using only a small number of parameters. Moreover, by improving the You Only Look Once (YOLO) network, we successfully detect the grouped vehicles within a frame. A novel model training scheme is then developed by learning from the grouped vehicles. With the proposed scheme, we train a single global model for both the foreground and the background of surveillance video. Experimental results show that our method achieves average PSNR gains of 0.40 dB, 0.22 dB and 0.24 dB over the H.265/HEVC anchor in the all-intra (AI), low-delay P (LDP) and random-access (RA) configurations, respectively, and produces visually pleasing results when applied to compressed surveillance video.
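
The PSNR gains quoted in the abstract can be read as the difference between two PSNR values, each measured against the same uncompressed frame: one for the enhanced frame and one for the codec's reconstructed frame. The following is a minimal sketch of that arithmetic, assuming 8-bit samples (peak value 255) and frames flattened into plain lists. The function names `psnr` and `psnr_gain` are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Return the PSNR in dB between two equally sized pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gain(reference, compressed, enhanced, peak=255.0):
    """PSNR of the enhanced frame minus PSNR of the compressed frame (dB)."""
    return psnr(reference, enhanced, peak) - psnr(reference, compressed, peak)


# Toy data (hypothetical values, not from the paper).
reference = [100, 100, 100, 100]
compressed = [104, 96, 104, 96]   # error of 4 per sample
enhanced = [102, 98, 102, 98]     # error of 2 per sample
gain_db = psnr_gain(reference, compressed, enhanced)
```

Because the gain depends only on the ratio of the two mean squared errors, halving the per-sample error in this toy example yields a gain of 10·log10(4), about 6 dB. The gains reported in the abstract are far smaller because they are averages over real compressed sequences.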

Updated: 2019-10-21