Utilising Low Complexity CNNs to Lift Non-Local Redundancies in Video Coding.
IEEE Transactions on Image Processing (IF 10.6). Pub Date: 2020-05-06. DOI: 10.1109/tip.2020.2991525
Jan P. Klopp, Liang-Gee Chen, Shao-Yi Chien

Digital media is ubiquitous and produced in ever-growing quantities. This necessitates a constant evolution of compression techniques, especially for video, in order to maintain efficient storage and transmission. In this work, we aim to exploit non-local redundancies in video data that remain difficult for conventional video codecs to remove. We design convolutional neural networks with a particular emphasis on a low memory and computational footprint. The parameters of these networks are trained on the fly, at encoding time, to predict the residual signal from the decoded video signal. Once training has converged, the parameters are compressed and signalled as part of the bitstream of the underlying video codec. The method can be applied to any existing video codec to increase coding gains, while its low computational footprint allows for application under resource-constrained conditions. Building on top of High Efficiency Video Coding, we achieve coding gains similar to those of pretrained denoising CNNs while requiring only about 1% of their computational complexity. Through extensive experiments, we provide insights into the effectiveness of our network design decisions. In addition, we demonstrate that our algorithm delivers stable performance under conditions met in practical video compression: it operates without significant performance loss on very long random-access segments (up to 256 frames) and, with moderate performance drops, can even be applied to single frames in high-resolution, low-delay settings.
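The following is a minimal PyTorch sketch of the encode-time training idea the abstract describes, not the authors' implementation: a tiny CNN is overfitted to one segment to predict the codec's residual (original minus decoded frame) from the decoded frame, and its weights are then coarsely quantised as a stand-in for the parameter compression and signalling step. The network size, optimiser settings, step count and quantisation step size are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRestorationCNN(nn.Module):
    """Low-complexity CNN mapping a decoded frame to a predicted residual."""
    def __init__(self, channels: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),  # residual estimate
        )

    def forward(self, decoded: torch.Tensor) -> torch.Tensor:
        return self.net(decoded)

def train_on_the_fly(original: torch.Tensor, decoded: torch.Tensor,
                     steps: int = 200) -> TinyRestorationCNN:
    """Overfit the network to one segment at encoding time (step count assumed)."""
    model = TinyRestorationCNN()
    target = original - decoded  # residual left behind by the underlying codec
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(model(decoded), target)
        loss.backward()
        opt.step()
    return model

def quantise_weights(model: TinyRestorationCNN, step: float = 1e-3) -> None:
    """Crude uniform weight quantisation standing in for the paper's
    parameter compression/signalling; the real scheme is more elaborate."""
    with torch.no_grad():
        for p in model.parameters():
            p.copy_(torch.round(p / step) * step)

if __name__ == "__main__":
    # Toy luma frames in [0, 1]; in practice these come from the video codec.
    original = torch.rand(1, 1, 64, 64)
    decoded = (original + 0.05 * torch.randn_like(original)).clamp(0, 1)
    model = train_on_the_fly(original, decoded)
    quantise_weights(model)  # weights would now be signalled to the decoder
    restored = (decoded + model(decoded)).clamp(0, 1)  # decoder-side enhancement

Because the network is trained per segment rather than pretrained on a corpus, the decoder only needs to run the (quantised) forward pass, which is where the low computational footprint claimed in the abstract comes from.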
