Abstract
Video deblurring has achieved excellent results with deep learning approaches. Capturing the dynamic spatio-temporal information in videos is crucial for deblurring. In this paper, we propose a two-stream DeblurGAN that combines a 3D stream with a 2D stream. The 3D convolution provides spatial and temporal invariance for restoring the foreground of frames, while the 2D convolution is sufficient to handle spatial features, given a relatively consistent background. Our model thus uses the greater processing power of the 3D stream for the foreground, which usually contains more dynamic motion blur, and the simplicity of the 2D stream for the mostly consistent background, gaining the advantages of both kinds of convolution. We take the two-stream model as the generator and train it with adversarial learning. We test our model on the VideoDeblurring and GOPRO datasets and compare it with other methods. Our method outperforms them in Peak Signal-to-Noise Ratio (PSNR) and performs especially well on foregrounds with obvious motion blur.
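Since the comparison is reported in PSNR, a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), may help make the metric concrete. The function name `psnr`, the nested-list frame representation, and the default peak value of 255 are assumptions made for illustration; this is not the evaluation code used in the paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return the PSNR in dB between two frames of equal size.

    Frames are nested lists of pixel rows (an assumed representation);
    max_value is the largest possible pixel value, e.g. 255 for 8-bit data.
    """
    ref_flat = [p for row in reference for p in row]
    out_flat = [p for row in restored for p in row]
    if not ref_flat or len(ref_flat) != len(out_flat):
        raise ValueError("frames must be non-empty and have the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, out_flat)) / len(ref_flat)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored frame is closer to the sharp reference. For a video, the per-frame values are typically averaged.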
Cite this article
Song, L., Wang, Q., Li, H. et al. Spatio-Temporal Learning for Video Deblurring based on Two-Stream Generative Adversarial Network. Neural Process Lett 53, 2701–2714 (2021). https://doi.org/10.1007/s11063-021-10520-y
DOI: https://doi.org/10.1007/s11063-021-10520-y