A Global Appearance and Local Coding Distortion Based Fusion Framework for CNN Based Filtering in Video Coding
IEEE Transactions on Broadcasting (IF 3.2), Pub Date: 2022-03-03, DOI: 10.1109/tbc.2022.3152064
Jian Yue, Yanbo Gao, Shuai Li, Hui Yuan, Frederic Dufaux

In-loop filtering is used in video coding to process the reconstructed frame and remove blocking artifacts. With the development of convolutional neural networks (CNNs), CNNs have been explored for in-loop filtering, since filtering can be treated as an image denoising task. However, in addition to being a distorted image, the reconstructed frame is also produced by a fixed pipeline of block-based encoding operations, and therefore carries coding-unit-based coding distortion with similar characteristics across blocks. Therefore, in this paper, we address the filtering problem from two aspects: (i) global appearance restoration for disrupted texture, and (ii) restoration of the local coding distortion caused by the fixed coding pipeline. Accordingly, a three-stream global appearance and local coding distortion based fusion network is developed, with a high-level global feature stream, a high-level local feature stream and a low-level local feature stream. An ablation study is conducted to validate the necessity of the different features, demonstrating that the global and local features complement each other in filtering and achieve better performance when combined. To the best of our knowledge, we are the first to clearly characterize the video filtering process in terms of the above global appearance and local coding distortion restoration aspects with experimental verification, providing a clear pathway for developing filtering techniques. Experimental results demonstrate that the proposed method significantly outperforms existing single-frame based methods, achieving 13.5%, 11.3% and 11.7% BD-Rate savings on average under the AI, LDP and RA configurations, respectively, compared with the HEVC reference software.
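The BD-Rate figures quoted above are computed with the standard Bjøntegaard delta-rate metric, which fits a polynomial to each rate-distortion curve and compares the average bitrate over the overlapping quality range. A minimal sketch of that metric using numpy (the RD points below are hypothetical illustration values, not results from the paper):

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjøntegaard delta rate: average bitrate difference (%) between two
    rate-distortion curves over their overlapping PSNR range.
    Negative values mean the test codec saves bitrate vs. the anchor."""
    # Work in log10(rate), as in the original Bjøntegaard formulation.
    lr_anchor = np.log10(rate_anchor)
    lr_test = np.log10(rate_test)
    # Fit cubic polynomials log10(rate) = f(PSNR) for each curve.
    p_anchor = np.polyfit(psnr_anchor, lr_anchor, 3)
    p_test = np.polyfit(psnr_test, lr_test, 3)
    # Integrate over the PSNR interval common to both curves.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_anchor = np.polyval(np.polyint(p_anchor), hi) - \
                 np.polyval(np.polyint(p_anchor), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - \
               np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_anchor) / (hi - lo)
    return (10 ** avg_diff - 1) * 100

# Hypothetical RD points (rate in kbps, PSNR in dB), for illustration only.
anchor_rates, anchor_psnr = [1000, 2000, 4000, 8000], [33.0, 35.5, 37.8, 39.9]
test_rates, test_psnr = [900, 1780, 3560, 7150], [33.1, 35.7, 38.0, 40.0]
print(bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr))
```

Because the test curve reaches equal or better PSNR at lower rates everywhere, the sketch prints a negative percentage, i.e., a bitrate saving in the same sense as the 13.5%/11.3%/11.7% figures reported in the abstract.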
