A new 3D convolutional neural network (3D-CNN) framework for multimedia event detection
Signal, Image and Video Processing (IF 2.3), Pub Date: 2020-10-19, DOI: 10.1007/s11760-020-01796-z
Kaavya Kanagaraj, G. G. Lakshmi Priya

Multimedia event detection has attracted considerable interest owing to advances in video technology and the growth of multimedia data. However, the complexity of video content, such as noise, overlap, repeated interactions between individuals, and varied scenes, makes it difficult to characterize subjects and concepts. In particular, Internet users find it difficult to search for a specific event. To address this problem, a method suited to event detection is proposed, using a 3D convolutional neural network (3D-CNN) structure to achieve promising performance in multimedia event classification. To take advantage of the motion content of events in the video, the temporal axis is considered. Both feature extraction and classification are incorporated in this model. Experiments are carried out on the Columbia Consumer Video benchmark dataset, and the results are compared with existing works.
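
To illustrate what considering the temporal axis means in practice, the following is a minimal NumPy sketch of a single 3D convolution applied to a short clip. It is not the authors' network: the function name `conv3d_valid`, the toy clip, and the temporal-difference kernel are all assumptions chosen only to show how a kernel that spans several frames can respond to motion.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation of a (T, H, W) clip with a (kt, kh, kw) kernel."""
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # The window covers kt consecutive frames, not just one image.
                out[t, y, x] = np.sum(clip[t:t + kt, y:y + kh, x:x + kw] * kernel)
    return out

# Hypothetical toy clip: 4 frames of 5x5 pixels, with one bright pixel
# moving one column to the right in each frame.
clip = np.zeros((4, 5, 5))
for t in range(4):
    clip[t, 2, t] = 1.0

# Hypothetical kernel spanning 2 frames and 1x1 pixels:
# next frame minus current frame (a temporal difference).
kernel = np.array([-1.0, 1.0]).reshape(2, 1, 1)

response = conv3d_valid(clip, kernel)                        # moving content
static_response = conv3d_valid(np.ones((4, 5, 5)), kernel)   # unchanging content
```

In this sketch the moving pixel produces nonzero responses while the static clip produces none, which is the kind of motion cue a purely per-frame 2D convolution cannot capture. A real 3D-CNN would learn many such kernels and stack them with pooling and classification layers.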

Updated: 2020-10-19