DeepSegment: Segmentation of motion capture data using deep convolutional neural network
Image and Vision Computing (IF 4.7), Pub Date: 2021-02-28, DOI: 10.1016/j.imavis.2021.104147
Hashim Yasin , Saqib Hayat

In this paper, we propose a novel framework for segmenting 3D human motion capture data into distinct behaviors. First, in preprocessing, we build a normalized pose space by removing translation and orientation from the 3D poses. We then transform these normalized 3D poses into 2D RGB images, which reduces motion segmentation to an image classification and recognition task. Furthermore, we identify the most significant joints of the skeleton, those that contribute substantially to executing a motion, and exploit them by assigning them larger weights. The weight allocated to a joint is based purely on its deviation capability. Finally, each motion is encoded into a compact visual representation using RGB images with weighted joints. We adopt a transfer learning approach and extract a fixed-size feature vector with an off-the-shelf deep Convolutional Neural Network (CNN), AlexNet, after fine-tuning. We build a Kd-tree on these highly descriptive feature vectors to retrieve nearest neighbors. Based on a similarity measure, we classify the motion segments and ultimately place the cuts on the ongoing motion sequences. We perform extensive experiments to evaluate the proposed approach on two popular Motion Capture (MoCap) datasets, CMU and HDM05. Our approach outperforms almost all other state-of-the-art methods, and the results highlight its capability for effective segmentation.
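The preprocessing and image-encoding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the joint layout, the exact definition of "deviation capability", or how weights enter the image. The sketch therefore assumes a y-up skeleton with hypothetical joint indices (`root`, `left_hip`, `right_hip`), approximates deviation by each joint's positional standard deviation, and applies the weights by scaling pixel intensity. All function names are illustrative.

```python
import math


def normalize_pose(joints, root=0, left_hip=1, right_hip=2):
    """Remove global translation and heading from one pose.

    joints: list of (x, y, z) tuples; y is ASSUMED to be the vertical axis.
    The joint indices are hypothetical and depend on the skeleton layout.
    """
    rx, ry, rz = joints[root]
    # Translate so that the root joint sits at the origin.
    centered = [(x - rx, y - ry, z - rz) for x, y, z in joints]
    # Heading = angle of the hip axis in the horizontal (x, z) plane.
    hx = centered[right_hip][0] - centered[left_hip][0]
    hz = centered[right_hip][2] - centered[left_hip][2]
    theta = math.atan2(hz, hx)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate about the vertical axis so that the hip axis points along +x.
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in centered]


def joint_weights(poses):
    """ASSUMED stand-in for 'deviation capability': per-joint positional
    standard deviation over the motion, scaled so the largest weight is 1."""
    n_frames, n_joints = len(poses), len(poses[0])
    deviations = []
    for j in range(n_joints):
        variance = 0.0
        for axis in range(3):
            mean = sum(p[j][axis] for p in poses) / n_frames
            variance += sum((p[j][axis] - mean) ** 2 for p in poses) / n_frames
        deviations.append(math.sqrt(variance))
    top = max(deviations)
    if top == 0.0:
        return [1.0] * n_joints  # static motion: treat all joints equally
    return [d / top for d in deviations]


def motion_to_rgb(poses, weights=None):
    """Encode a motion as an image: one row per joint, one column per frame,
    with (R, G, B) = min-max scaled (x, y, z), multiplied by the joint weight."""
    n_joints = len(poses[0])
    if weights is None:
        weights = [1.0] * n_joints
    lows, spans = [], []
    for axis in range(3):
        values = [p[j][axis] for p in poses for j in range(n_joints)]
        low, high = min(values), max(values)
        lows.append(low)
        # Guard against a (numerically) constant axis.
        spans.append(high - low if high - low > 1e-12 else 1.0)
    image = []
    for j in range(n_joints):
        row = []
        for p in poses:
            row.append(tuple(
                int(round(255 * weights[j] * (p[j][axis] - lows[axis]) / spans[axis]))
                for axis in range(3)
            ))
        image.append(row)
    return image
```

In the pipeline the abstract outlines, images like these would then be passed to a fine-tuned CNN to obtain fixed-size feature vectors, which are indexed in a Kd-tree for nearest-neighbor retrieval; those stages are omitted here because they depend on a trained model and labeled data.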




Updated: 2021-03-16