Learning a Deep Dual Attention Network for Video Super-Resolution.
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2020-02-12, DOI: 10.1109/tip.2020.2972118
Feng Li , Huihui Bai , Yao Zhao

Recently, deep learning based video super-resolution (SR) methods have combined convolutional neural networks (CNNs) with motion compensation to estimate a high-resolution (HR) video from its low-resolution (LR) counterpart. However, most previous methods perform motion estimation on downscaled frames to handle large motions, which can degrade the accuracy of motion estimation because of the reduced spatial resolution. Besides, these methods usually treat different types of intermediate features equally, and thus lack the flexibility to emphasize the information that is meaningful for revealing high-frequency details. In this paper, to address these issues, we propose a deep dual attention network (DDAN), comprising a motion compensation network (MCNet) and an SR reconstruction network (ReconNet), to fully exploit spatio-temporal informative features for accurate video SR. The MCNet progressively learns optical flow representations in a pyramid fashion to synthesize motion information across adjacent frames. To reduce the mis-registration errors caused by optical flow based motion compensation, we extract the detail components of the original LR neighboring frames as complementary information for accurate feature extraction. In the ReconNet, we apply dual attention mechanisms to a residual unit, forming a residual attention unit that focuses on the informative intermediate features for recovering high-frequency details. Extensive experiments on numerous datasets demonstrate that the proposed method achieves superior performance in both quantitative and qualitative assessments compared with state-of-the-art methods.
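
The abstract does not spell out the MCNet layers, but the motion-compensation step it describes is backward warping of a neighboring LR frame with a pyramid-estimated optical flow. Below is a minimal PyTorch sketch of that warping step only; the flow estimator itself is omitted, and `warp_frame` with its pixel-unit flow convention is an assumption for illustration, not the authors' implementation.

```python
# Sketch: flow-based motion compensation of a neighboring LR frame.
# `flow` is assumed to be in pixel units with shape (B, 2, H, W);
# the pyramid flow estimator that produces it is not shown.
import torch
import torch.nn.functional as F

def warp_frame(frame: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp `frame` (B, C, H, W) by `flow` (B, 2, H, W)."""
    b, _, h, w = frame.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=frame.dtype, device=frame.device),
        torch.arange(w, dtype=frame.dtype, device=frame.device),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # displaced x coordinates
    grid_y = ys.unsqueeze(0) + flow[:, 1]  # displaced y coordinates
    # Normalize coordinates to [-1, 1] as grid_sample expects.
    grid = torch.stack(
        (2.0 * grid_x / (w - 1) - 1.0, 2.0 * grid_y / (h - 1) - 1.0),
        dim=-1,
    )
    return F.grid_sample(frame, grid, align_corners=True)

# Usage: warped = warp_frame(neighbor_lr, estimated_flow)
```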
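
On the ReconNet side, a residual attention unit applies two attention mechanisms to the features of a residual unit before the skip connection is added. The abstract does not give the exact form of the two mechanisms, so the sketch below assumes a common channel-plus-spatial pairing (squeeze-and-excitation style channel attention followed by a pooled-statistics spatial mask); all class names, the reduction ratio, and the kernel sizes are illustrative assumptions.

```python
# Sketch of a residual unit reweighted by dual (channel + spatial)
# attention; the exact configuration in the paper may differ.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (assumed form)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))

class SpatialAttention(nn.Module):
    """Spatial mask from pooled channel statistics (assumed form)."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask

class ResidualAttentionUnit(nn.Module):
    """Residual unit whose body output passes through both attentions
    before being added back to the input."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        out = self.body(x)
        out = self.sa(self.ca(out))  # dual attention on intermediate features
        return x + out               # residual connection

# Usage: feats = ResidualAttentionUnit(64)(torch.randn(1, 64, 32, 32))
```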

Updated: 2020-04-22