Biprediction-Based Video Quality Enhancement via Learning
IEEE Transactions on Cybernetics ( IF 11.8 ) Pub Date : 2020-06-17 , DOI: 10.1109/tcyb.2020.2998481
Dandan Ding, Wenyu Wang, Junchao Tong, Xinbo Gao, Zoe Liu, Yong Fang
Convolutional neural network (CNN)-based video quality enhancement generally employs optical flow for pixelwise motion estimation and compensation, and then jointly exploits the spatiotemporal correlation across the motion-compensated frames to facilitate enhancement. This approach, referred to as the optical-flow-based method (OPT), usually achieves high accuracy at the expense of high computational complexity. In this article, we develop a new framework, referred to as biprediction-based multiframe video enhancement (PMVE), to achieve a one-pass enhancement procedure. PMVE designs two networks, namely, the prediction network (Pred-net) and the frame-fusion network (FF-net), to implement the two steps of synthesis and fusion, respectively. Specifically, the Pred-net leverages frame pairs to synthesize so-called virtual frames (VFs) for the low-quality frames (LFs) through biprediction. Afterward, the FF-net, built on a slow-fusion architecture, takes the VFs as input and extracts the correlation between the VFs and the related LFs to obtain an enhanced version of those LFs. Such a framework allows PMVE to leverage the cross-correlation between successive frames for enhancement and is hence capable of high accuracy. Meanwhile, PMVE avoids the explicit operations of motion estimation and compensation, greatly reducing the complexity compared with OPT. The experimental results demonstrate that the peak signal-to-noise ratio (PSNR) performance of PMVE is fully on par with that of OPT, while its computational complexity is only 1% of that of OPT. Compared with other state-of-the-art methods in the literature, PMVE is also confirmed to achieve superior objective and visual quality at a reasonable complexity level. For instance, PMVE surpasses its best counterpart method by up to 0.42 dB in PSNR.
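The two-stage synthesize-then-fuse pipeline described in the abstract can be illustrated with a minimal NumPy sketch. Note this is only a conceptual stand-in, not the paper's method: the real Pred-net and FF-net are learned CNNs, whereas here biprediction is approximated by a simple average of the two neighboring high-quality frames and fusion by a fixed-weight blend; all function names and the `alpha` weight are illustrative assumptions.

```python
import numpy as np

def predict_virtual_frame(prev_hq, next_hq):
    """Crude stand-in for Pred-net: form a virtual frame (VF) for the
    low-quality frame by averaging its two high-quality neighbors
    (the actual Pred-net performs learned biprediction)."""
    return (prev_hq.astype(np.float32) + next_hq.astype(np.float32)) / 2.0

def fuse_frames(virtual, low_quality, alpha=0.5):
    """Crude stand-in for FF-net: blend the VF with the low-quality
    frame (the actual FF-net learns this fusion via slow fusion)."""
    return alpha * virtual + (1.0 - alpha) * low_quality.astype(np.float32)

def enhance(prev_hq, lf, next_hq):
    """One-pass enhancement of a low-quality frame from its neighbors."""
    vf = predict_virtual_frame(prev_hq, next_hq)
    return np.clip(fuse_frames(vf, lf), 0, 255).astype(np.uint8)

# Toy 2x2 grayscale frames: neighbors at 100 and 120, LF degraded to 90.
prev_hq = np.full((2, 2), 100, dtype=np.uint8)
next_hq = np.full((2, 2), 120, dtype=np.uint8)
lf      = np.full((2, 2),  90, dtype=np.uint8)
enhanced = enhance(prev_hq, lf, next_hq)
print(enhanced)  # VF is 110; blending with the LF (90) gives 100
```

The key structural point the sketch preserves is that no explicit motion estimation or compensation is performed: the VF is produced directly from the neighboring frames in one pass, which is the source of PMVE's complexity advantage over OPT.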
