Skeleton-based attention-aware spatial–temporal model for action detection and recognition
IET Computer Vision (IF 1.5), Pub Date: 2020-08-06, DOI: 10.1049/iet-cvi.2019.0751
Ran Cui 1, 2 , Aichun Zhu 3 , Jingran Wu 2 , Gang Hua 1

Action detection and recognition are popular research subjects in computer vision. Action detection can be regarded as the combination of action localization and recognition. Action features derived from human skeleton information are robust against external factors and require little computation. This study proposes a skeleton-based action analysis model built on a recurrent neural network framework. The model learns action features by modelling the static and dynamic features of skeleton joints, and it learns the importance of different video frames through an attention module. For action localization, a conditional random field loss function is introduced to establish the context dependency of output labels. For action recognition, a hierarchical training mechanism with triple loss models action features at coarse-grained and fine-grained levels. The authors report that the proposed method delivers state-of-the-art results on action localization and recognition tasks.
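
The abstract does not specify how the attention module is built. As a rough illustration only, the idea of weighting video frames by importance can be sketched as softmax attention pooling over per-frame feature vectors. Everything here is an assumption for illustration: the function name `attention_pool`, the scoring vector `w`, and the random features are hypothetical and do not come from the paper.

```python
import numpy as np

def attention_pool(frame_features, w):
    """Pool per-frame features into one vector using softmax attention.

    frame_features: array of shape (T, D), one D-dim feature per frame.
    w: array of shape (D,), a hypothetical learned scoring vector
       (an assumption; the paper's actual attention design is not given).
    """
    scores = frame_features @ w                 # one scalar score per frame, shape (T,)
    scores = scores - scores.max()              # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax: positive, sums to 1
    pooled = weights @ frame_features           # attention-weighted sum, shape (D,)
    return pooled, weights

# Toy usage with random stand-in features (not real skeleton data).
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 4))   # 5 frames, 4-dim features
w = rng.normal(size=4)
pooled, weights = attention_pool(features, w)
```

Frames with higher scores contribute more to the pooled vector, which is one common way to let a sequence model emphasise informative frames.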

Updated: 2020-08-20