Action recognition for sports video analysis using part-attention spatio-temporal graph convolutional network,Journal of Electronic Imaging

Action recognition for sports video analysis using part-attention spatio-temporal graph convolutional network
Journal of Electronic Imaging (IF 1.1), Pub Date: 2021-06-01, DOI: 10.1117/1.jei.30.3.033017
Jiatong Liu, Yanli Che

Action recognition contributes significantly to sports video analysis, particularly to the evaluation of athletes’ training. In sports video, action information is conveyed mainly by the temporal movement of human body parts, and each part carries a different importance for the action representation. To incorporate this observation into action recognition, we propose a part-attention spatio-temporal graph convolutional network (PSGCN) that exploits the dynamic spatio-temporal information in a sports video and learns the importance of each part to emphasize its contribution to the recognition task. Specifically, PSGCN first divides the human body into six parts, extracts a convolutional neural network (CNN) feature for each part, and concatenates the global feature of the whole frame; it then uses a cross-part and cross-frame graph building module to formulate the graph correlation among parts from different frames. Motivated by the observation that a larger temporal variation of the same part carries more action information, we further propose a part-attention (PA) learning module that estimates the importance of each part and uses it to strengthen the graph correlation, yielding a PA graph. Finally, PSGCN applies a graph convolutional network to the learned PA spatio-temporal graph together with the learned part CNN features to obtain the action representation of the given sports video. The whole network is optimized with two losses, a PA loss and an action classification loss. To demonstrate the superiority of PSGCN, we carry out extensive experiments comparing our model with several state-of-the-art methods on widely used action recognition datasets, especially sports action datasets. The results reflect the advantages of the proposed PSGCN for sports video analysis.
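
The pipeline described in the abstract can be illustrated with a small NumPy sketch. The abstract gives no formulas, so every concrete choice here is an assumption made for illustration and not the authors' method: part importance is taken as a softmax over each part's mean frame-to-frame feature change, graph edges are non-negative cosine similarities between all part nodes across all frames, and the graph convolution is a standard symmetrically normalized layer. All function names are hypothetical, the per-part features are random stand-ins for CNN features (assumed to already include any concatenated global feature), and the two training losses are omitted.

```python
import numpy as np

def part_attention(feats):
    """Assumed importance score: softmax over each part's mean temporal variation.
    feats has shape (T, P, D): T frames, P body parts, D feature dimensions."""
    variation = np.linalg.norm(np.diff(feats, axis=0), axis=-1).mean(axis=0)  # (P,)
    e = np.exp(variation - variation.max())
    return e / e.sum()

def build_graph(feats):
    """Cross-part, cross-frame graph: one node per (frame, part) pair,
    edges given by non-negative cosine similarity (an assumption)."""
    T, P, D = feats.shape
    x = feats.reshape(T * P, D)  # node index = t * P + p
    x_unit = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    adj = np.clip(x_unit @ x_unit.T, 0.0, None)
    return x, adj

def pa_graph(adj, weights, T):
    """Scale each edge by the importance of the two parts it connects."""
    node_w = np.tile(weights, T)  # matches the frame-major node ordering
    return adj * np.outer(node_w, node_w)

def gcn_layer(x, adj, W):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ W, 0.0)

# Toy input: 8 frames, 6 body parts, 16-dim stand-in features.
rng = np.random.default_rng(0)
T, P, D, H = 8, 6, 16, 32
feats = rng.normal(size=(T, P, D))
W = rng.normal(size=(D, H)) * 0.1  # untrained layer weights

weights = part_attention(feats)
x, adj = build_graph(feats)
adj_pa = pa_graph(adj, weights, T)
node_out = gcn_layer(x, adj_pa, W)
video_repr = node_out.mean(axis=0)  # pooled action representation, shape (H,)
```

In a trained model the pooled vector `video_repr` would feed a classifier, and the attention weights would be learned under the PA loss instead of being computed by a fixed rule as they are here.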

Updated: 2021-06-03
