A Global Appearance and Local Coding Distortion Based Fusion Framework for CNN Based Filtering in Video Coding
IEEE Transactions on Broadcasting (IF 4.5), Pub Date: 2022-03-03, DOI: 10.1109/tbc.2022.3152064
Jian Yue, Yanbo Gao, Shuai Li, Hui Yuan, Frederic Dufaux

In-loop filtering is used in video coding to process the reconstructed frame and remove blocking artifacts. With the development of convolutional neural networks (CNNs), CNNs have been explored for in-loop filtering, since the task can be treated as image de-noising. However, in addition to being a distorted image, the reconstructed frame is also produced by a fixed pipeline of block-based encoding operations in video coding, and thus carries coding-unit based coding distortion with similar characteristics. Therefore, in this paper, we address the filtering problem from two aspects: (i) global appearance restoration for disrupted textures, and (ii) restoration of the local coding distortion caused by the fixed coding pipeline. Accordingly, a three-stream global appearance and local coding distortion based fusion network is developed, comprising a high-level global feature stream, a high-level local feature stream and a low-level local feature stream. An ablation study is conducted to validate the necessity of the different features, demonstrating that global and local features complement each other in filtering and achieve better performance when combined. To the best of our knowledge, we are the first to clearly characterize the video filtering process in terms of the above global appearance and local coding distortion restoration with experimental verification, providing a clear pathway for developing filtering techniques. Experimental results demonstrate that the proposed method significantly outperforms existing single-frame based methods, achieving 13.5%, 11.3% and 11.7% BD-Rate savings on average for the AI, LDP and RA configurations, respectively, compared with the HEVC reference software.
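To make the three-stream fusion idea concrete, below is a minimal PyTorch-style sketch, not the authors' implementation. The class names (ThreeStreamFilterNet, ConvBlock), stream depths, channel counts, downsample/upsample choice for the global stream, concatenation-based fusion and residual output are all assumptions made for illustration only; the paper's actual network and training details differ.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """A small stack of 3x3 conv + ReLU layers used to build each stream (illustrative)."""
    def __init__(self, in_ch, out_ch, num_layers):
        super().__init__()
        layers, ch = [], in_ch
        for _ in range(num_layers):
            layers += [nn.Conv2d(ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
            ch = out_ch
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class ThreeStreamFilterNet(nn.Module):
    """Illustrative three-stream fusion filter:
    - a deep, downsampled stream with a larger receptive field for global appearance,
    - a deep full-resolution stream for local coding distortion,
    - a shallow stream preserving low-level local detail.
    Depths and channel counts are arbitrary choices for this sketch."""
    def __init__(self, channels=64):
        super().__init__()
        # High-level global stream: downsample to enlarge the receptive field, then restore size.
        self.global_stream = nn.Sequential(
            nn.Conv2d(1, channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            ConvBlock(channels, channels, 6),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
        )
        # High-level local stream: deep, full resolution.
        self.local_high = ConvBlock(1, channels, 8)
        # Low-level local stream: shallow, keeps fine detail.
        self.local_low = ConvBlock(1, channels, 2)
        # Fusion: concatenate the three streams, then 1x1 and 3x3 convolutions.
        self.fusion = nn.Sequential(
            nn.Conv2d(3 * channels, channels, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, rec):
        # rec: reconstructed luma frame, shape (N, 1, H, W) with even H and W in this sketch.
        g = self.global_stream(rec)
        lh = self.local_high(rec)
        ll = self.local_low(rec)
        # Residual learning: predict a correction added to the reconstruction.
        return rec + self.fusion(torch.cat([g, lh, ll], dim=1))

if __name__ == "__main__":
    net = ThreeStreamFilterNet()
    frame = torch.randn(1, 1, 64, 64)   # dummy reconstructed patch
    print(net(frame).shape)             # torch.Size([1, 1, 64, 64])
```

The sketch only shows how the global and local streams could be fused; how the filter is attached to the HEVC reconstruction loop and trained per QP is outside its scope.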

Updated: 2022-03-03