Fusion of spatial and dynamic CNN streams for action recognition
Multimedia Systems (IF 3.5) Pub Date: 2021-03-23, DOI: 10.1007/s00530-021-00773-x
Newlin Shebiah Russel, Arivazhagan Selvaraj

Human action recognition is widely explored as it finds varied applications including visual navigation, surveillance, video indexing, biometrics, human–computer interaction, ambient assisted living, etc. This paper aims to design and analyze the performance of spatial and temporal CNN streams for action recognition from videos. An action video is fragmented into a predefined number of segments called snippets. For each segment, the atomic poses portrayed by the individual are effectively captured by a representative frame, and the dynamics of the action are well described by a dynamic image. The representative frames and dynamic images are trained separately with Convolutional Neural Networks for further analysis. The results attained on the KTH, Weizmann, UCF Sports and UCF101 datasets confirm the efficiency of the proposed architecture for action recognition.
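A minimal sketch of the pipeline the abstract describes: a video is split into snippets, each snippet contributes a representative frame (spatial stream) and a dynamic image (dynamic stream), and the two CNN streams are fused. This is not the authors' released code; the approximate rank-pooling weights of Bilen et al. for the dynamic image, the ResNet-18 backbones, the middle-frame selection, the snippet count, and the averaged late fusion are illustrative assumptions rather than details stated in the abstract.

```python
# Hypothetical two-stream snippet pipeline (spatial + dynamic image), PyTorch.
import torch
import torch.nn as nn
from torchvision.models import resnet18


def dynamic_image(frames: torch.Tensor) -> torch.Tensor:
    """Collapse one snippet (T, C, H, W) into a single dynamic image.

    Uses the approximate rank-pooling coefficients of Bilen et al.
    (an assumption here, since the abstract does not specify the method):
    alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1}), H_t = sum_{i<=t} 1/i.
    """
    T = frames.shape[0]
    harmonics = torch.cumsum(1.0 / torch.arange(1, T + 1, dtype=frames.dtype), dim=0)
    t = torch.arange(1, T + 1, dtype=frames.dtype)
    h_prev = torch.cat([torch.zeros(1, dtype=frames.dtype), harmonics[:-1]])
    alpha = 2 * (T - t + 1) - (T + 1) * (harmonics[-1] - h_prev)
    return (alpha.view(T, 1, 1, 1) * frames).sum(dim=0)


class TwoStreamActionNet(nn.Module):
    """Spatial stream on representative frames, dynamic stream on dynamic images."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.spatial = resnet18(num_classes=num_classes)   # backbone choice is illustrative
        self.dynamic = resnet18(num_classes=num_classes)

    def forward(self, video: torch.Tensor, num_snippets: int = 3) -> torch.Tensor:
        # video: (T, C, H, W) -> split into a predefined number of snippets
        snippets = torch.chunk(video, num_snippets, dim=0)
        logits = []
        for snip in snippets:
            rep_frame = snip[len(snip) // 2]       # middle frame as representative frame
            dyn_image = dynamic_image(snip)        # motion summary of the snippet
            s = self.spatial(rep_frame.unsqueeze(0))
            d = self.dynamic(dyn_image.unsqueeze(0))
            logits.append((s + d) / 2)             # simple late fusion of the two streams
        return torch.stack(logits).mean(dim=0)     # average scores over snippets


if __name__ == "__main__":
    model = TwoStreamActionNet(num_classes=101)    # e.g. UCF101
    clip = torch.randn(30, 3, 224, 224)            # 30 RGB frames
    print(model(clip).shape)                       # torch.Size([1, 101])
```

In practice the two streams would be trained separately, as the abstract states, and fused only at inference; the averaged-logit fusion above is just one common choice for combining them.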


