Human action recognition toward massive-scale sport sceneries based on deep multi-model feature fusion
Signal Processing: Image Communication ( IF 3.4 ) Pub Date : 2020-01-23 , DOI: 10.1016/j.image.2020.115802
Ersan Zhou , Heqing Zhang

In sport sceneries, automatically recognizing human actions is a useful technique that can be widely applied in many domains, such as human body tracking and athlete behavior analysis. Most state-of-the-art deep architectures have achieved competitive performance in recognizing human action. However, it is still a challenging task due to unavoidable occlusion, camera angle changes, and varied human posture. In this paper, we propose a novel deep multimodal feature fusion algorithm for human action recognition. The key technique is a multi-model feature fusion scheme. More specifically, we fuse visual features, skeleton posture, probability maps, and audio signals into a hybrid feature, which is utilized to represent human action. Then these feature channels are optimally combined using a deep model, wherein the weights of multiple feature channels can be predicted intelligently. Finally, the optimally fused features are fed into a multi-class SVM for conducting human action recognition. Extensive comparative results and parameter analysis have shown the effectiveness of our proposed method.
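
The weighted fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the deep model predicts the channel weights or how the channels are combined, so the sketch assumes softmax-normalized weights and a weighted concatenation. The names `softmax`, `fuse_channels`, `channels`, and `scores`, as well as all numeric values, are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw channel scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_channels(channels, scores):
    """Scale each channel's feature vector by its weight, then concatenate."""
    if len(channels) != len(scores):
        raise ValueError("need one score per feature channel")
    weights = softmax(scores)
    fused = []
    for weight, features in zip(weights, channels):
        fused.extend(weight * value for value in features)
    return fused, weights


# Toy, made-up feature vectors for the four channels named in the abstract.
channels = [
    [0.2, 0.8, 0.5],  # visual feature
    [1.0, 0.0],       # skeleton posture
    [0.3, 0.3, 0.4],  # probability map
    [0.6],            # audio signal
]

# Assumption: fixed placeholder scores. In the paper, the channel weights
# are predicted by a deep model rather than set by hand.
scores = [0.0, 0.0, 0.0, 0.0]

fused, weights = fuse_channels(channels, scores)
```

Under these assumptions, `fused` is a single hybrid vector whose length equals the total number of input features. A vector like this is what would then be passed to a multi-class SVM for the final classification.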




Updated: 2020-03-22