Relative view based holistic-separate representations for two-person interaction recognition using multiple graph convolutional networks
Journal of Visual Communication and Image Representation (IF 2.6). Pub Date: 2020-05-21. DOI: 10.1016/j.jvcir.2020.102833
Xing Liu, Yanshan Li, Tianyu Guo, Rongjie Xia

In this paper, we focus on recognizing person-person interactions from skeletal data captured by depth sensors. First, we propose a novel and efficient view transformation scheme: the skeletal interaction sequence is re-observed under a new coordinate system that is invariant to the varied setups and capturing views of depth cameras, as well as to exchanges of position or facing orientation between the two persons. Second, we propose concise and discriminative interaction representations composed simply of the joint locations of the two persons. These representations efficiently describe both the holistic interactive scene and the individual poses performed by each subject. Third, we introduce graph convolutional networks (GCNs) to learn directly from the proposed skeletal interaction representations. Moreover, we design a model composed of multiple GCNs to provide the final class score. Extensive experimental results on three skeleton-based action datasets, NTU RGB+D 60, NTU RGB+D 120, and SBU, consistently demonstrate the superiority of our interaction recognition method.
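To make the view transformation concrete, the sketch below re-expresses a two-person skeleton pair in a coordinate frame anchored between the subjects, so the result no longer depends on where the depth camera was placed. This is a minimal illustration of the general idea only: the joint indices, the choice of origin, and the axis construction are assumptions made for this sketch, not details taken from the paper.

import numpy as np

def relative_view_transform(skel_a, skel_b, hip=0, spine=1):
    """Re-observe a two-person skeleton pair in a shared relative
    coordinate frame. skel_a, skel_b: (J, 3) arrays of 3D joints.
    Joint indices `hip` and `spine` are hypothetical choices."""
    # Origin: midpoint between the two subjects' hip joints, so the
    # frame no longer depends on the camera's position.
    origin = 0.5 * (skel_a[hip] + skel_b[hip])
    a, b = skel_a - origin, skel_b - origin

    # x-axis: direction from person A's hip to person B's hip, tying
    # the frame to the pair's geometry rather than the camera view.
    x = b[hip] - a[hip]
    x /= np.linalg.norm(x) + 1e-8

    # z-axis: "up" estimated from A's hip-to-spine direction,
    # orthogonalized against x; y completes a right-handed frame.
    up = a[spine] - a[hip]
    z = up - np.dot(up, x) * x
    z /= np.linalg.norm(z) + 1e-8
    y = np.cross(z, x)

    R = np.stack([x, y, z])          # rows are the new basis vectors
    return a @ R.T, b @ R.T

Applying the same rotation to every frame of a sequence would yield a trajectory expressed consistently in this relative frame, which is the kind of invariance the abstract describes.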
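Likewise, the multiple-GCN scoring can be pictured as parallel streams: one over the holistic two-person graph and one applied to each person's own graph, with the per-stream class scores fused at the end. The PyTorch sketch below uses a generic graph convolution, X' = relu(A_hat X W), and simple score averaging; the layer design, the shared classification head, and the fusion rule are assumptions for illustration, not the paper's exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphConv(nn.Module):
    """One spatial graph convolution over a joint graph:
    X' = relu(A_hat X W). A generic GCN layer, not the paper's block."""
    def __init__(self, in_dim, out_dim, A_hat):
        super().__init__()
        self.register_buffer("A_hat", A_hat)   # normalized adjacency (J, J)
        self.fc = nn.Linear(in_dim, out_dim)

    def forward(self, x):                      # x: (N, J, in_dim)
        return F.relu(self.fc(self.A_hat @ x))

class MultiGCN(nn.Module):
    """Three streams: one GCN over the holistic two-person graph and a
    shared GCN applied to each person separately; softmax scores are
    averaged for the final class score (fusion rule is an assumption)."""
    def __init__(self, A_pair, A_single, num_classes, dim=64):
        super().__init__()
        # A_pair: (2J, 2J) adjacency of the combined two-person graph;
        # A_single: (J, J) adjacency of one person's skeleton graph.
        self.holistic = nn.Sequential(GraphConv(3, dim, A_pair),
                                      GraphConv(dim, dim, A_pair))
        self.separate = nn.Sequential(GraphConv(3, dim, A_single),
                                      GraphConv(dim, dim, A_single))
        self.head = nn.Linear(dim, num_classes)

    def forward(self, pair, person_a, person_b):
        # pair: (N, 2J, 3); person_a, person_b: (N, J, 3)
        scores = []
        for feat in (self.holistic(pair),
                     self.separate(person_a),
                     self.separate(person_b)):
            # Pool over joints, classify, convert to class probabilities.
            scores.append(self.head(feat.mean(dim=1)).softmax(-1))
        return sum(scores) / 3.0               # fused final class score

The sketch omits the temporal dimension for brevity; in practice skeleton sequences would be pooled or processed per frame before or alongside such spatial layers.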


