Spatio-Temporal Learning for Video Deblurring based on Two-Stream Generative Adversarial Network

Neural Processing Letters

Abstract

Deep learning approaches have achieved excellent results in video deblurring, and capturing the dynamic spatio-temporal information in videos is crucial to the task. In this paper, we propose a two-stream DeblurGAN that combines a 3D stream with a 2D stream. The 3D convolution provides spatial and temporal invariance to restore the foreground of frames, while the 2D convolution is sufficient to handle spatial features, given a relatively consistent background. Our model thus uses the greater processing power of the 3D stream for the foreground, which usually contains more dynamic motion blur, and the simplicity of the 2D stream for the mostly consistent background, combining the strengths of both kinds of convolution. We take the two-stream model as the generator and train it with adversarial learning. We test our model on the VideoDeblurring and GOPRO datasets and compare it with other methods. Our method outperforms the compared methods in Peak Signal-to-Noise Ratio (PSNR) and performs especially well on foregrounds with obvious motion blur.
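
The abstract reports its results in PSNR. For reference, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the flat-list frame representation, and the default peak value of 255 (8-bit pixels) are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return the PSNR in dB between two frames.

    Both frames are flat sequences of pixel intensities of equal length.
    max_value is the peak intensity (assumed 255 for 8-bit images).
    """
    if len(reference) != len(restored):
        raise ValueError("frames must contain the same number of pixels")
    if len(reference) == 0:
        raise ValueError("frames must not be empty")

    # Mean squared error between the sharp reference and the restored frame.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)

    # Identical frames have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    sharp = [100, 100, 100, 100]
    deblurred = [110, 90, 110, 90]
    print(f"PSNR: {psnr(sharp, deblurred):.2f} dB")
```

A higher PSNR means the restored frame is closer to the sharp ground truth. For a video, the per-frame values are typically averaged, although the exact protocol used in the paper is not stated in the abstract.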

Author information

Corresponding author

Correspondence to Liyao Song.

About this article

Cite this article

Song, L., Wang, Q., Li, H. et al. Spatio-Temporal Learning for Video Deblurring based on Two-Stream Generative Adversarial Network. Neural Process Lett 53, 2701–2714 (2021). https://doi.org/10.1007/s11063-021-10520-y

  • DOI: https://doi.org/10.1007/s11063-021-10520-y
