


Neural texture transfer assisted video coding with adaptive up-sampling
Signal Processing: Image Communication (IF 3.4), Pub Date: 2022-06-03, DOI: 10.1016/j.image.2022.116754
Li Yu, Wenshuai Chang, Weize Quan, Jimin Xiao, Dong-Ming Yan, Moncef Gabbouj

Deep learning techniques have been extensively investigated as a way to further improve the efficiency of traditional video compression. Some deep learning techniques for down/up-sampling-based video coding have proved especially effective when bandwidth or storage is limited. Existing works differ mainly in the super-resolution model used. Some simply apply a single-image super-resolution model, ignoring the rich information in the correlation between video frames, while others exploit inter-frame correlation by simply concatenating features across adjacent frames. The latter approach, however, may fail when the textures are not well aligned. In this paper, we propose to use neural texture transfer, which exploits the semantic correlation between frames and can extract correlated information even when the textures are not aligned. In addition, an adaptive group of pictures (GOP) method is proposed to decide automatically whether a frame should be down-sampled. Experimental results show that the proposed method outperforms standard HEVC and state-of-the-art methods under different compression configurations. Compared with standard HEVC, the BD-rate (PSNR) and BD-rate (SSIM) of the proposed method reach up to -19.1% and -26.5%, respectively.
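
For readers unfamiliar with the metric, the BD-rate figures quoted in the abstract are Bjøntegaard delta rates: the average bitrate difference between two codecs at equal quality, where a negative value means the tested codec needs fewer bits. The following is a minimal sketch of one common way to compute it, assuming exactly four rate-distortion points per codec and a cubic fit of log-bitrate against PSNR. The function names and the sample numbers are hypothetical and are not taken from the paper, which may use a different tool or interpolation.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_log_rate(points, lo, hi):
    """Mean of the fitted log-bitrate curve over the quality range [lo, hi].

    With four points the interpolant is a cubic, and Simpson's rule
    integrates a cubic exactly, so three evaluations are enough.
    """
    quality = [q for _, q in points]
    log_rate = [math.log(r) for r, _ in points]
    mid = 0.5 * (lo + hi)
    f = lambda x: lagrange_eval(quality, log_rate, x)
    return (f(lo) + 4.0 * f(mid) + f(hi)) / 6.0


def bd_rate(anchor, test):
    """BD-rate in percent; each argument is four (bitrate, psnr) pairs."""
    assert len(anchor) == 4 and len(test) == 4, "sketch assumes 4 RD points"
    # Only the quality range covered by both curves is compared.
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    assert lo < hi, "the two curves must overlap in quality"
    diff = mean_log_rate(test, lo, hi) - mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


# Hypothetical RD points (kbps, dB PSNR), for illustration only.
anchor_points = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 37.5), (8000.0, 39.5)]
# A test codec that reaches the same PSNR with 20% fewer bits at every point.
test_points = [(0.8 * r, q) for r, q in anchor_points]
print(round(bd_rate(anchor_points, test_points), 3))
```

Because the hypothetical test codec saves 20% of the bitrate at every quality level, the sketch reports a BD-rate of about -20%. Substituting SSIM for PSNR as the quality axis gives the SSIM-based variant.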


Updated: 2022-06-03
