Space-time super-resolution with motion-perceptive deformable alignment
Journal of Electronic Imaging ( IF 1.0 ) Pub Date : 2021-06-01 , DOI: 10.1117/1.jei.30.3.033020
Zhuojun Cai 1 , Xiang Tian 1 , Ze Chen 1 , Yaowu Chen 1

We propose a motion-perceptive deformable alignment network that uses pre-computed optical flow to improve the motion perception of the deformable alignment process. The pre-computed flows share the burden of motion estimation with the learned offsets, while the flexibility of the deformable convolutional network is maintained. In addition, we propose a motion-adaptive pyramid structure in which features are aligned at multiple scales and then merged according to the motion strength among the input frames. With these structures, we construct a space-time super-resolution (STSR) network with improved motion compensation ability. STSR aims to restore a high-resolution, high-frame-rate sequence from its low-resolution, low-frame-rate version. The proposed STSR network is trained on the Vimeo-90K dataset and tested on the Vimeo-90K, densely-annotated video segmentation (DAVIS), and realistic and diverse scenes (REDS) datasets. Performance is evaluated using the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), computed over the entire restored frame on the Y channel. Extensive experiments show that the proposed network outperforms both one-stage and two-stage STSR methods, with improved alignment ability and significantly better synthesis of interpolated frames.
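
The Y-channel PSNR used for evaluation can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code: the abstract does not say how the Y channel is obtained, so the BT.601 luma conversion below is an assumption, as are the function names `rgb_to_y` and `psnr_y` and the peak value of 255.

```python
import numpy as np


def rgb_to_y(img):
    """Map an (H, W, 3) RGB image with values in [0, 255] to BT.601 luma.

    Assumption: the abstract does not state which conversion was used;
    BT.601 "studio swing" luma (range 16..235) is a common choice.
    """
    img = np.asarray(img, dtype=np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr_y(restored, reference, peak=255.0):
    """PSNR in dB between two RGB frames, measured on the Y channel only."""
    diff = rgb_to_y(restored) - rgb_to_y(reference)
    mse = np.mean(diff ** 2)  # mean squared error over the entire frame
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical frames, used only to exercise the functions.
    rng = np.random.default_rng(0)
    reference = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
    restored = np.clip(reference + rng.normal(0.0, 2.0, reference.shape), 0, 255)
    print(f"Y-channel PSNR: {psnr_y(restored, reference):.2f} dB")
```

Averaging `psnr_y` over all restored frames of a sequence gives one score per sequence; whether the paper averages per frame or per sequence is not stated in the abstract.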

Updated: 2021-06-08