Zooming SlowMo: An Efficient One-Stage Framework for Space-Time Video Super-Resolution
arXiv - CS - Multimedia. Pub Date: 2021-04-15. arXiv: 2104.07473
Xiaoyu Xiang, Yapeng Tian, Yulun Zhang, Yun Fu, Jan P. Allebach, Chenliang Xu

In this paper, we address space-time video super-resolution, which aims at generating a high-resolution (HR) slow-motion video from a low-resolution (LR) and low frame rate (LFR) video sequence. A naïve method is to decompose it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR). Nevertheless, temporal interpolation and spatial upscaling are intra-related in this problem, and two-stage approaches cannot fully exploit this natural property. Besides, state-of-the-art VFI and VSR deep networks usually have a large frame reconstruction module in order to obtain high-quality photo-realistic video frames, so two-stage approaches end up with large models and are relatively time-consuming. To overcome these issues, we present a one-stage space-time video super-resolution framework that directly reconstructs an HR slow-motion video sequence from an input LR and LFR video. Instead of reconstructing missing LR intermediate frames as VFI models do, we temporally interpolate the features of the missing LR frames with a feature temporal interpolation module that captures local temporal contexts. Extensive experiments on widely used benchmarks demonstrate that the proposed framework not only achieves better qualitative and quantitative performance on both clean and noisy LR frames but also is several times faster than recent state-of-the-art two-stage networks. The source code is released at https://github.com/Mukosame/Zooming-Slow-Mo-CVPR-2020 .
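To make the one-stage idea concrete, here is a minimal PyTorch sketch of the pipeline the abstract describes: per-frame LR feature extraction, a feature temporal interpolation module that synthesizes features for each missing intermediate frame from its two LR neighbors, and a shared upsampling head that decodes every feature map to an HR frame. All module names, channel sizes, and the simple convolutional blending used for interpolation are illustrative assumptions, not the authors' architecture; their actual implementation is in the linked repository.

```python
# Minimal sketch of a one-stage space-time VSR pipeline (illustrative only;
# not the authors' architecture from the Zooming-Slow-Mo repository).
import torch
import torch.nn as nn


class FeatureTemporalInterpolation(nn.Module):
    """Synthesizes the feature map of a missing intermediate frame from the
    features of its two LR neighbors, instead of synthesizing the frame
    itself as a VFI model would (a hypothetical, simplified design)."""

    def __init__(self, channels: int):
        super().__init__()
        # Learned per-neighbor transforms plus a blending convolution that
        # aggregates local temporal context from both sides.
        self.t_prev = nn.Conv2d(channels, channels, 3, padding=1)
        self.t_next = nn.Conv2d(channels, channels, 3, padding=1)
        self.blend = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, f_prev: torch.Tensor, f_next: torch.Tensor) -> torch.Tensor:
        h = torch.cat([self.t_prev(f_prev), self.t_next(f_next)], dim=1)
        return self.blend(h)


class OneStageSTVSR(nn.Module):
    """Maps an LR, low-frame-rate clip directly to an HR clip with roughly
    doubled frame rate: feature extraction -> feature temporal
    interpolation -> shared HR reconstruction (PixelShuffle upscaling)."""

    def __init__(self, channels: int = 64, scale: int = 4):
        super().__init__()
        self.extract = nn.Conv2d(3, channels, 3, padding=1)
        self.interp = FeatureTemporalInterpolation(channels)
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr_frames: torch.Tensor) -> torch.Tensor:
        # lr_frames: (B, T, 3, H, W)
        feats = [self.extract(f) for f in lr_frames.unbind(dim=1)]
        # Insert an interpolated feature map between every neighboring pair,
        # so T input frames yield 2T - 1 output frames.
        seq = []
        for i, f in enumerate(feats):
            seq.append(f)
            if i + 1 < len(feats):
                seq.append(self.interp(f, feats[i + 1]))
        hr = [self.upsample(f) for f in seq]   # each (B, 3, 4H, 4W)
        return torch.stack(hr, dim=1)          # (B, 2T-1, 3, 4H, 4W)


if __name__ == "__main__":
    clip = torch.rand(1, 4, 3, 32, 32)         # 4 LR frames, 32x32
    out = OneStageSTVSR()(clip)
    print(out.shape)                           # torch.Size([1, 7, 3, 128, 128])
```

Running the script turns a 4-frame 32x32 clip into a 7-frame 128x128 clip, illustrating how a single network interpolates in time and upscales in space in one pass, with no intermediate LR frames ever reconstructed.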

Updated: 2021-04-16