Multiperson interaction recognition in images: A body keypoint based feature image analysis
Computational Intelligence (IF 2.8), Pub Date: 2020-10-28, DOI: 10.1111/coin.12419
Amit Verma, Toshanlal Meenpal, Bibhudendra Acharya

Most interaction recognition approaches have been limited to single-person action classification in videos. For still images, however, where motion information is not available, the task becomes more complex. To address this, we propose an approach for multiperson human interaction recognition in images based on keypoint-based feature image analysis. The proposed method is a three-stage framework. In the first stage, we propose a feature-based neural network (FCNN) for action recognition, trained on feature images. Feature images encode body features, that is, effective distances between a set of body part pairs and angular relations between body part triplets, rearranged into a 2D gray-scale image to learn effective representations of complex actions. In the second stage, we propose a voting-based direction encoding method to anticipate probable motion in still images. Finally, our multiperson interaction recognition algorithm identifies which human pairs are interacting with each other using an interaction parameter. We evaluate our approach on two real-world data sets, namely UT-Interaction and SBU Kinect Interaction. Empirical experiments show that our results surpass state-of-the-art methods, with recognition accuracies of 95.83% on UT-I set 1, 92.5% on UT-I set 2, and 94.28% on the SBU clean data set.
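The feature-image construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the particular pair/triplet sets, the normalization to [0, 255], and the 2D rearrangement scheme are all assumptions made for the example; the paper defines its own selections of body part pairs and triplets.

```python
import numpy as np

def feature_image(keypoints, pairs, triplets, size=(4, 4)):
    """Build a 2D gray-scale feature image from body keypoints.

    keypoints: (N, 2) array of (x, y) joint coordinates
    pairs:     list of (i, j) indices -> Euclidean distances
    triplets:  list of (i, j, k) indices -> angle at middle joint j
    Distance and angle features are min-max normalized and
    rearranged into a `size`-shaped uint8 image.
    """
    feats = []
    # Effective distances between body part pairs
    for i, j in pairs:
        feats.append(np.linalg.norm(keypoints[i] - keypoints[j]))
    # Angular relation for each body part triplet (angle at joint j)
    for i, j, k in triplets:
        v1 = keypoints[i] - keypoints[j]
        v2 = keypoints[k] - keypoints[j]
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
        feats.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    feats = np.asarray(feats, dtype=np.float64)
    # Min-max scale to [0, 1], guarding against a constant vector
    feats = (feats - feats.min()) / (feats.max() - feats.min() + 1e-8)
    # Rearrange the 1D feature vector into a 2D gray-scale image
    img = np.zeros(size[0] * size[1])
    img[: len(feats)] = feats
    return (img.reshape(size) * 255).astype(np.uint8)

# Toy skeleton with 4 keypoints (illustrative coordinates only)
kp = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]])
img = feature_image(kp,
                    pairs=[(0, 1), (1, 2), (0, 3)],
                    triplets=[(0, 1, 2), (1, 2, 3)])
print(img.shape)  # (4, 4)
```

An image like this, one per detected person, would then serve as the input sample for the first-stage action-recognition network.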
