Camera motion detection for story and multimedia information convergence
Personal and Ubiquitous Computing (IF 3.006), Pub Date: 2021-06-18, DOI: 10.1007/s00779-021-01585-6
Hui-Yong Bak, Seung-Bo Park

Camera motion in a video conveys the director's intention to viewers and is an essential element in holding their interest, so detecting it is an important part of movie analysis. Existing research on camera motion detection has focused mainly on pan, tilt, and zoom. Movies, however, use more diverse camera motions to express complex and varied emotions, and recognizing only pan, tilt, and zoom is limiting; in particular, it cannot detect lateral and longitudinal movements of the camera. This study proposes a deep learning method that detects boom and truck in addition to pan, tilt, and zoom. The method has two components: the Improved Extractor of Camera Motion, which uses optical flow to extract camera motion vectors from video at eight-frame intervals, and the CNN-Based Detector, which uses ResNet-152 to identify the five camera motions. The proposed method achieves an accuracy of 86.2%.
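
To make the extraction step concrete, the following is a minimal sketch, not the authors' Improved Extractor of Camera Motion or CNN-Based Detector. It only illustrates two ideas from the abstract: pairing frames at eight-frame intervals, and summarizing a motion-vector field. All function names (`frame_pairs`, `synthetic_flow`, `summarize_flow`, `coarse_label`), the synthetic flow fields, and the threshold are hypothetical and chosen for illustration; a real pipeline would compute optical flow from actual frames and pass the result to a trained CNN.

```python
from statistics import mean

FRAME_INTERVAL = 8  # the abstract states that vectors are extracted at eight-frame intervals


def frame_pairs(num_frames, interval=FRAME_INTERVAL):
    """Return (start, end) frame-index pairs spaced `interval` frames apart."""
    return [(i, i + interval) for i in range(0, num_frames - interval, interval)]


def synthetic_flow(width, height, kind, strength=2.0):
    """Build a toy motion-vector field as a list of (x, y, dx, dy).

    Assumption: this stands in for real optical flow, which the sketch does not compute.
    """
    cx, cy = (width - 1) / 2, (height - 1) / 2
    vectors = []
    for y in range(height):
        for x in range(width):
            if kind == "horizontal":
                dx, dy = strength, 0.0
            elif kind == "vertical":
                dx, dy = 0.0, strength
            elif kind == "zoom":
                # Vectors point away from the image centre.
                dx, dy = strength * (x - cx), strength * (y - cy)
            else:
                dx, dy = 0.0, 0.0
            vectors.append((x, y, dx, dy))
    return vectors


def summarize_flow(vectors, width, height):
    """Return (mean_dx, mean_dy, mean_radial) for a motion-vector field."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    mean_dx = mean(v[2] for v in vectors)
    mean_dy = mean(v[3] for v in vectors)
    radial = []
    for x, y, dx, dy in vectors:
        rx, ry = x - cx, y - cy
        norm = (rx * rx + ry * ry) ** 0.5
        if norm > 0:
            # Component of the flow along the direction away from the centre.
            radial.append((dx * rx + dy * ry) / norm)
    return mean_dx, mean_dy, (mean(radial) if radial else 0.0)


def coarse_label(summary, threshold=0.5):
    """Hand-written heuristic (hypothetical threshold), not a learned detector."""
    mean_dx, mean_dy, mean_radial = summary
    scores = {
        "horizontal": abs(mean_dx),
        "vertical": abs(mean_dy),
        "radial": abs(mean_radial),
    }
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score >= threshold else "static"


if __name__ == "__main__":
    print(frame_pairs(25))
    for kind in ("horizontal", "vertical", "zoom", "none"):
        field = synthetic_flow(5, 5, kind)
        print(kind, coarse_label(summarize_flow(field, 5, 5)))
```

These summary statistics only separate dominant horizontal, vertical, and radial flow. Pan and truck both produce mostly horizontal flow, and tilt and boom both produce mostly vertical flow, so this sketch makes no attempt to tell them apart; the abstract assigns that five-way decision to the ResNet-152 classifier.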




Updated: 2021-06-18