Camera motion detection for story and multimedia information convergence

Bak, Hui-Yong; Park, Seung-Bo

doi:10.1007/s00779-021-01585-6

Original Article
Published: 18 June 2021

Volume 27, pages 1221–1231, (2023)
Personal and Ubiquitous Computing

Abstract

Camera motion in a video conveys the intention of the director to viewers and is an essential element for holding their interest. Detecting camera motion is therefore an important part of movie analysis. Existing research on camera motion detection in video has mainly focused on pan, tilt, and zoom. Movies, however, use more diverse camera motions to express complex and varied emotions, so recognizing only pan, tilt, and zoom is limiting; in particular, lateral and longitudinal movements of the camera cannot be detected. This study proposes a deep learning method that detects boom and truck in addition to pan, tilt, and zoom. The method has two components: the Improved Extractor of Camera Motion and the CNN-Based Detector. The Improved Extractor of Camera Motion uses optical flow to extract camera motion vectors from video at eight-frame intervals. The CNN-Based Detector classifies the five camera motions using ResNet-152. The proposed method achieves an accuracy of 86.2%.
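To make the extraction step more concrete, the following is a minimal sketch in plain Python. It is not the Improved Extractor of Camera Motion itself, whose details are not given in the abstract. The function names `frame_pairs` and `summarize_flow`, the choice of non-overlapping frame pairs, and the three summary statistics are all assumptions made for illustration. The sketch shows how frame pairs eight frames apart might be selected, and how a dense flow field already computed for one pair could be reduced to a mean horizontal shift, a mean vertical shift, and a mean radial component.

```python
from typing import Iterator, List, Tuple

Vector = Tuple[float, float]  # (dx, dy) displacement of one grid cell


def frame_pairs(n_frames: int, step: int = 8) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame indices that are `step` frames apart.

    Assumption: pairs do not overlap; the paper may sample differently.
    """
    for start in range(0, n_frames - step, step):
        yield start, start + step


def summarize_flow(flow: List[List[Vector]]) -> Tuple[float, float, float]:
    """Return (mean dx, mean dy, mean radial component) of a flow grid.

    flow[y][x] holds the (dx, dy) displacement at grid cell (x, y).
    The radial component is the projection of each vector onto the
    unit direction pointing away from the grid centre.
    """
    h, w = len(flow), len(flow[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    sum_dx = sum_dy = sum_rad = 0.0
    for y, row in enumerate(flow):
        for x, (dx, dy) in enumerate(row):
            sum_dx += dx
            sum_dy += dy
            rx, ry = x - cx, y - cy
            norm = (rx * rx + ry * ry) ** 0.5
            if norm > 0:  # the centre cell has no radial direction
                sum_rad += (dx * rx + dy * ry) / norm
    n = h * w
    return sum_dx / n, sum_dy / n, sum_rad / n


# Hypothetical 3x3 flow fields: a uniform horizontal shift (pan-like)
# and vectors pointing away from the centre (zoom-like).
pan_like = [[(1.0, 0.0) for _ in range(3)] for _ in range(3)]
zoom_like = [[(x - 1.0, y - 1.0) for x in range(3)] for y in range(3)]
```

In this toy setting a pan-like field yields a large mean horizontal shift and almost no radial component, while a zoom-like field yields the opposite. Averages of this kind would likely not separate truck from pan or boom from tilt, since those motions differ mainly in how the flow varies with scene depth; this is one plausible reason for passing the motion vectors to a learned classifier instead.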

Funding

This work was supported by an INHA UNIVERSITY Research Grant (INHA-63134).

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Inha University, Incheon, South Korea
Hui-Yong Bak
Department of Software Convergence Engineering, Inha University, Incheon, South Korea
Seung-Bo Park

Corresponding author

Correspondence to Seung-Bo Park.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bak, HY., Park, SB. Camera motion detection for story and multimedia information convergence. Pers Ubiquit Comput 27, 1221–1231 (2023). https://doi.org/10.1007/s00779-021-01585-6

Received: 05 March 2021
Accepted: 08 June 2021
Published: 18 June 2021
Issue Date: June 2023
DOI: https://doi.org/10.1007/s00779-021-01585-6
