Human Interaction Recognition in Videos with Body Pose Traversal Analysis and Pairwise Interaction Framework
IETE Journal of Research (IF 1.3), Pub Date: 2020-08-10, DOI: 10.1080/03772063.2020.1802355
Amit Verma, Toshanlal Meenpal, Bibhudendra Acharya

Interaction recognition in videos using body pose is gaining remarkable attention due to its speed and robustness. Recently proposed recurrent neural network (RNN) and deep ConvNet-based methods perform well at learning sequential information. Despite this, RNNs lag behind in learning the spatial relations between body parts, while deep ConvNets require a huge amount of training data. We propose a traversal-based three-layer neural network (TNN), followed by a pairwise interaction framework (PIF), for interaction recognition. We also propose a novel algorithm for tracking humans in successive frames. The proposed algorithm computes the collective traversal of individual body parts across the frames and feeds it to the TNN to learn an effective representation of complex actions. The PIF model combines the confidence scores of a pair of action labels corresponding to an interaction to produce the final interaction prediction. We evaluate the approach on two publicly available datasets, UT-Interaction and SBU Kinect Interaction. Results show that our proposed approach outperforms the state-of-the-art methods.
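
The abstract gives no formulas for the traversal feature or the pairwise fusion, so the following is only a minimal sketch of how the two ideas might fit together, not the authors' method. It assumes that "traversal" can be approximated by the path length each joint covers across frames, and that the PIF step can be approximated by multiplying the two persons' action confidences. The function names, the action labels, and the product rule are all hypothetical.

```python
import math


def joint_traversal(poses):
    """Path length covered by each joint across frames (assumed feature).

    poses: list of frames; each frame is a list of (x, y) joint coordinates,
    with joints in the same order in every frame.
    """
    if not poses:
        return []
    totals = [0.0] * len(poses[0])
    # Sum the Euclidean displacement of every joint between successive frames.
    for prev, curr in zip(poses, poses[1:]):
        for j, ((x0, y0), (x1, y1)) in enumerate(zip(prev, curr)):
            totals[j] += math.hypot(x1 - x0, y1 - y0)
    return totals


def fuse_pair(scores_a, scores_b, interaction_pairs):
    """Score each interaction from two per-person action confidence dicts.

    interaction_pairs maps an interaction label to the pair of action labels
    that make it up. The product rule used here is an assumption.
    """
    fused = {}
    for interaction, (act_1, act_2) in interaction_pairs.items():
        # Either person may play either role, so try both assignments.
        forward = scores_a.get(act_1, 0.0) * scores_b.get(act_2, 0.0)
        reverse = scores_a.get(act_2, 0.0) * scores_b.get(act_1, 0.0)
        fused[interaction] = max(forward, reverse)
    return fused


def predict_interaction(scores_a, scores_b, interaction_pairs):
    """Return the interaction label with the highest fused score."""
    fused = fuse_pair(scores_a, scores_b, interaction_pairs)
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Two joints tracked over three frames (toy coordinates).
    poses = [[(0, 0), (1, 1)], [(3, 4), (1, 1)], [(3, 4), (1, 2)]]
    print(joint_traversal(poses))

    # Hypothetical action labels and confidences for two tracked persons.
    pairs = {
        "kicking": ("kick", "be_kicked"),
        "handshake": ("extend_hand", "extend_hand"),
    }
    person_a = {"kick": 0.7, "be_kicked": 0.1, "extend_hand": 0.2}
    person_b = {"kick": 0.1, "be_kicked": 0.6, "extend_hand": 0.3}
    print(predict_interaction(person_a, person_b, pairs))
```

In this sketch the per-person confidences stand in for the output of a classifier such as the TNN. Trying both role assignments means the prediction does not depend on which tracked person is labelled A or B.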




Updated: 2020-08-10