Abstract
Camera motion in a video conveys the director's intention to viewers and is an essential element in holding their interest. Detecting camera motion is therefore an important part of movie analysis. Existing research on camera motion detection in video has focused mainly on pan, tilt, and zoom. Movies, however, use more diverse camera motions to express complex and varied emotions, so recognizing only pan, tilt, and zoom is limiting; in particular, it cannot detect lateral and longitudinal movements of the camera. This study proposes a deep learning method that detects boom and truck in addition to pan, tilt, and zoom. The method consists of two components: the Improved Extractor of Camera Motion and the CNN-Based Detector. The Improved Extractor of Camera Motion uses optical flow to extract camera motion vectors from video at eight-frame intervals. The CNN-Based Detector uses ResNet-152 to identify the five camera motions. The proposed method achieves an accuracy of 86.2%.
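To make the extraction step more concrete, the following is a minimal sketch, not the authors' implementation. It shows how frame pairs might be sampled at eight-frame intervals and why different camera motions leave different signatures in an optical flow field: an idealized pan gives a uniform field, while an idealized zoom gives a radial one. The names `frame_pairs`, `synthetic_flow`, and `summarize` are hypothetical, the flow fields are synthetic rather than computed from video, and the hand-written summary stands in only for illustration; the study itself classifies the motion vectors with a ResNet-152 detector.

```python
from typing import Iterator, List, Tuple

Vector = Tuple[float, float]
FlowField = List[List[Vector]]  # flow[y][x] = (dx, dy) in pixels


def frame_pairs(n_frames: int, step: int = 8) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame indices spaced `step` frames apart.

    Assumption: consecutive, non-overlapping pairs; the source only
    states that motion vectors are extracted at eight-frame intervals.
    """
    for start in range(0, n_frames - step, step):
        yield start, start + step


def synthetic_flow(kind: str, height: int, width: int,
                   speed: float = 1.0) -> FlowField:
    """Build an idealized flow field: 'pan' is uniform, 'zoom' is radial."""
    cy, cx = (height - 1) / 2, (width - 1) / 2
    field: FlowField = []
    for y in range(height):
        row: List[Vector] = []
        for x in range(width):
            if kind == "pan":
                row.append((speed, 0.0))
            elif kind == "zoom":
                row.append((speed * (x - cx), speed * (y - cy)))
            else:
                raise ValueError(f"unknown kind: {kind}")
        field.append(row)
    return field


def summarize(flow: FlowField) -> Tuple[float, float, float]:
    """Return (mean dx, mean dy, mean radial component) of a flow field.

    The radial component is the dot product of each flow vector with its
    offset from the image centre, averaged over all pixels.
    """
    h, w = len(flow), len(flow[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    n = h * w
    mean_dx = sum(dx for row in flow for dx, _ in row) / n
    mean_dy = sum(dy for row in flow for _, dy in row) / n
    radial = sum(
        dx * (x - cx) + dy * (y - cy)
        for y, row in enumerate(flow)
        for x, (dx, dy) in enumerate(row)
    ) / n
    return mean_dx, mean_dy, radial


if __name__ == "__main__":
    print(list(frame_pairs(25)))
    print("pan :", summarize(synthetic_flow("pan", 4, 4)))
    print("zoom:", summarize(synthetic_flow("zoom", 4, 4)))
```

In this toy setting a pan shows a non-zero mean horizontal displacement and no radial component, whereas a zoom shows the opposite. Real footage mixes camera motion with object motion and noise, which is one reason a learned classifier is a plausible choice over fixed thresholds.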
Funding
This work was supported by an INHA UNIVERSITY Research Grant (INHA-63134).
Cite this article
Bak, HY., Park, SB. Camera motion detection for story and multimedia information convergence. Pers Ubiquit Comput 27, 1221–1231 (2023). https://doi.org/10.1007/s00779-021-01585-6