Spatio-Temporal Learning for Video Deblurring based on Two-Stream Generative Adversarial Network

Song, Liyao; Wang, Quan; Li, Haiwei; Fan, Jiancun; Hu, Bingliang

doi:10.1007/s11063-021-10520-y

Published: 24 April 2021

Volume 53, pages 2701–2714, (2021)

Neural Processing Letters

Liyao Song ORCID: orcid.org/0000-0002-1159-2270¹,
Quan Wang²,
Haiwei Li²,
Jiancun Fan³ &
…
Bingliang Hu²

Abstract

Deep learning approaches have achieved excellent results in video deblurring. Capturing the dynamic spatio-temporal information in videos is crucial for deblurring. In this paper, we propose a two-stream DeblurGAN that combines a 3D stream with a 2D stream. The 3D convolution provides spatial and temporal invariance to restore the foreground of frames, while the 2D convolution is sufficient to deal with spatial features, given a relatively consistent background. Thus, our model uses the greater processing power of the 3D stream to handle the foreground, which usually contains more dynamic motion blur, and the simplicity of the 2D stream to handle the mostly consistent background, gaining the advantages of both kinds of convolution. We then take the two-stream model as the generator and adopt adversarial learning. We test our model on the VideoDeblurring and GOPRO datasets and compare it with other methods. Our method outperforms the others in Peak Signal-to-Noise Ratio (PSNR) and performs especially well on foregrounds with obvious motion blur.
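The abstract reports results in PSNR but does not spell out the metric. The sketch below shows the standard definition, PSNR = 10 · log10(MAX² / MSE), for a single frame. The function name `psnr`, the nested-list frame representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration; they are not taken from the paper, which may compute the metric differently (for example per channel or averaged over a whole video).

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale frames.

    Frames are given as nested lists (rows of pixel intensities).
    This is an illustrative helper, not the authors' evaluation code.
    """
    # Flatten both frames so they can be compared pixel by pixel.
    ref_flat = [p for row in reference for p in row]
    res_flat = [p for row in restored for p in row]
    if not ref_flat or len(ref_flat) != len(res_flat):
        raise ValueError("frames must be non-empty and have the same size")

    # Mean squared error between the sharp reference and the restored frame.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, res_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored frame is closer to the sharp reference. A video-level score is commonly obtained by averaging the per-frame values, although that averaging choice is also an assumption here.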

Author information

Authors and Affiliations

School of Information and Communications Engineering, Xi’an Jiaotong University, Xi’an, China
Liyao Song
Key Laboratory of Spectral Imaging Technology, Xi’an Institute of Optics and Precision Mechanics of Chinese Academy of Sciences, Xi’an, China
Quan Wang, Haiwei Li & Bingliang Hu
The School of Information and Communications Engineering, Xi’an Jiaotong University, Xi’an, China
Jiancun Fan

Corresponding author

Correspondence to Liyao Song.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Song, L., Wang, Q., Li, H. et al. Spatio-Temporal Learning for Video Deblurring based on Two-Stream Generative Adversarial Network. Neural Process Lett 53, 2701–2714 (2021). https://doi.org/10.1007/s11063-021-10520-y

Accepted: 16 April 2021
Published: 24 April 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s11063-021-10520-y
