Cross-Frame Transformer-Based Spatio-Temporal Video Super-Resolution
IEEE Transactions on Broadcasting (IF 3.2), Pub Date: 2022-02-07, DOI: 10.1109/tbc.2022.3147145
Wenhui Zhang 1, Mingliang Zhou 2, Cheng Ji 3, Xiubao Sui 1, Junqi Bai 4

In this paper, we explore the spatio-temporal video super-resolution task, which aims to generate a high-resolution, high-frame-rate video from an existing low-resolution, low-frame-rate video. First, we propose an end-to-end spatio-temporal video super-resolution network composed chiefly of cross-frame transformers instead of traditional convolutions. Specifically, the cross-frame transformer module divides the input feature sequence into query, key, and value matrices, and then obtains the maximum similarity and similarity coefficient matrices between the neighboring and current feature maps through self-attention operations. Next, we propose a multi-level residual reconstruction module, which makes full use of the maximum similarity and similarity coefficient matrices obtained by the cross-frame transformer to reconstruct the high-resolution, high-frame-rate results from coarse to fine. Qualitative and quantitative evaluation results show that our method offers better performance and requires fewer trainable parameters than the existing two-stage network.
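
The matching step described in the abstract can be illustrated with a small sketch. This is one plausible reading of the abstract, not the authors' implementation: it assumes that the "maximum similarity" matrix holds, for each query position in the current frame, the index of the best-matching key position in the neighboring frame, and that the "similarity coefficient" matrix holds the corresponding cosine similarity. The function name `cross_frame_attention` and the toy feature vectors are hypothetical, and plain Python lists stand in for feature maps.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cross_frame_attention(queries, keys, values):
    """Match each query vector (current frame) to its most similar key vector
    (neighboring frame) and gather the value stored at that position.

    Assumed interpretation of the abstract:
      index[i]    - position of the best-matching key for query i
      coeff[i]    - cosine similarity of that best match
      gathered[i] - value vector taken from position index[i]
    """
    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]
    gathered, coeff, index = [], [], []
    for qi in q:
        # Cosine similarity between this query and every key.
        sims = [sum(a * b for a, b in zip(qi, kj)) for kj in k]
        best = max(range(len(sims)), key=sims.__getitem__)
        index.append(best)
        coeff.append(sims[best])
        gathered.append(list(values[best]))
    return gathered, coeff, index


# Toy data: two query positions in the current frame, three positions in a neighbor.
current = [[1.0, 0.0], [0.0, 2.0]]
neighbor = [[0.0, 1.0], [3.0, 0.0], [1.0, 1.0]]
neighbor_values = [[10.0], [20.0], [30.0]]

gathered, coeff, index = cross_frame_attention(current, neighbor, neighbor_values)
```

In this toy case the first query aligns with the second neighbor position and the second query with the first, so `index` is `[1, 0]`. A reconstruction stage could then weight the gathered features by `coeff`, although how the paper combines them is not stated in the abstract.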

Updated: 2022-02-07