iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
Computational Visual Media (IF 6.9), Pub Date: 2020-07-20, DOI: 10.1007/s41095-020-0175-7
Aman Chadha, John Britto, M. Mani Roja

Recently, learning-based models have enhanced the performance of single-image super-resolution (SISR). However, applying SISR successively to each video frame leads to a lack of temporal coherency. Convolutional neural networks (CNNs) outperform traditional approaches in terms of image quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). On the other hand, generative adversarial networks (GANs) offer a competitive advantage in that they mitigate the lack of finer texture detail usually seen with CNNs when super-resolving at large upscaling factors. We present iSeeBetter, a novel GAN-based spatio-temporal approach to video super-resolution (VSR) that renders temporally consistent super-resolution videos. iSeeBetter extracts spatial and temporal information from the current and neighboring frames using the concept of recurrent back-projection networks as its generator. Furthermore, to improve the “naturality” of the super-resolved output while eliminating artifacts seen with traditional algorithms, we utilize the discriminator from the super-resolution generative adversarial network (SRGAN). Although mean squared error (MSE) as the primary loss-minimization objective improves PSNR/SSIM, these metrics may not capture fine details in the image, resulting in a misrepresentation of perceptual quality. To address this, we use a four-fold loss comprising MSE, perceptual, adversarial, and total-variation terms. Our results demonstrate that iSeeBetter offers superior VSR fidelity and surpasses state-of-the-art performance.
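To make the four-fold objective concrete, below is a minimal PyTorch-style sketch of how MSE, perceptual (VGG-feature), adversarial, and total-variation terms could be combined into a single generator loss. The weights W_MSE, W_PERCEPTUAL, W_ADVERSARIAL, and W_TV, the choice of VGG16 layers, and the function names are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Hypothetical weights for the four loss terms; the abstract does not
# specify the paper's actual weighting scheme.
W_MSE, W_PERCEPTUAL, W_ADVERSARIAL, W_TV = 1.0, 6e-3, 1e-3, 2e-8


class PerceptualLoss(nn.Module):
    """Perceptual loss: MSE between frozen VGG16 features of the
    super-resolved and ground-truth frames (layer cut is illustrative)."""

    def __init__(self):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16]
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg.eval()
        self.mse = nn.MSELoss()

    def forward(self, sr, hr):
        return self.mse(self.vgg(sr), self.vgg(hr))


def total_variation_loss(img):
    """Penalize differences between neighboring pixels to suppress noise."""
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return dh + dw


def four_fold_loss(sr, hr, disc_fake, perceptual):
    """Combine MSE, perceptual, adversarial, and total-variation terms.

    sr:         super-resolved frame from the generator, shape (N, C, H, W)
    hr:         ground-truth high-resolution frame, shape (N, C, H, W)
    disc_fake:  discriminator scores in (0, 1) for the generated frame
    perceptual: an instance of PerceptualLoss
    """
    mse_loss = nn.functional.mse_loss(sr, hr)
    perc_loss = perceptual(sr, hr)
    # Adversarial term: push the discriminator's score on generated
    # frames toward 1 (i.e., "looks real").
    adv_loss = -torch.log(disc_fake + 1e-8).mean()
    tv_loss = total_variation_loss(sr)
    return (W_MSE * mse_loss
            + W_PERCEPTUAL * perc_loss
            + W_ADVERSARIAL * adv_loss
            + W_TV * tv_loss)
```

In such a setup, the MSE term drives PSNR/SSIM, while the perceptual and adversarial terms encourage realistic texture and the total-variation term discourages high-frequency artifacts; this mirrors the trade-off the abstract describes between distortion metrics and perceptual quality.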

Updated: 2020-07-20