PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time
ACM Transactions on Graphics (IF 6.2), Pub Date: 2020-11-27, DOI: 10.1145/3414685.3417877
Soshi Shimada, Vladislav Golyanik, Weipeng Xu, Christian Theobalt

Marker-less 3D human motion capture from a single colour camera has seen significant progress. However, it is a very challenging and severely ill-posed problem. In consequence, even the most accurate state-of-the-art approaches have significant limitations. Purely kinematic formulations on the basis of individual joints or skeletons, and the frequent frame-wise reconstruction in state-of-the-art methods greatly limit 3D accuracy and temporal stability compared to multi-view or marker-based motion capture. Further, captured 3D poses are often physically incorrect and biomechanically implausible, or exhibit implausible environment interactions (floor penetration, foot skating, unnatural body leaning and strong shifting in depth), which is problematic for any use case in computer graphics. We, therefore, present PhysCap, the first algorithm for physically plausible, real-time and marker-less human 3D motion capture with a single colour camera at 25 fps. Our algorithm first captures 3D human poses purely kinematically. To this end, a CNN infers 2D and 3D joint positions, and subsequently, an inverse kinematics step finds space-time coherent joint angles and global 3D pose. Next, these kinematic reconstructions are used as constraints in a real-time physics-based pose optimiser that accounts for environment constraints (e.g., collision handling and floor placement), gravity, and biophysical plausibility of human postures. Our approach employs a combination of ground reaction force and residual force for plausible root control, and uses a trained neural network to detect foot contact events in images. Our method captures physically plausible and temporally stable global 3D human motion, without physically implausible postures, floor penetrations or foot skating, from video in real time and in general scenes. PhysCap achieves state-of-the-art accuracy on established pose benchmarks, and we propose new metrics to demonstrate the improved physical plausibility and temporal stability.
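To make the two-stage structure described in the abstract concrete, the following is a minimal Python sketch of the per-frame control flow: a kinematic stage (CNN plus inverse kinematics), a foot-contact detector, and a physics stage that combines a ground reaction force with a residual root force. Everything here is an illustrative assumption based only on the abstract: the function names, the gain values, and the reduction of the body to a point-mass root are ours, not the authors' implementation, and the two neural networks are replaced by stubs.

```python
# Hypothetical sketch of a PhysCap-style two-stage pipeline, reconstructed
# from the abstract alone. All names, gains, and simplifications are
# illustrative assumptions, not the published implementation.
import numpy as np

FLOOR_HEIGHT = 0.0  # assumed ground plane at y = 0

def kinematic_stage(frame):
    """Stage I (assumed interface): a CNN predicts 2D/3D joint positions,
    then an inverse-kinematics step yields space-time coherent joint
    angles and a global root pose. Returns dummy values here."""
    q_kin = np.zeros(30)                     # joint angles (placeholder DoF count)
    root_kin = np.array([0.0, 0.95, 0.0])    # pelvis position in metres
    return q_kin, root_kin

def detect_foot_contacts(frame):
    """Stand-in for the trained foot-contact network described in the
    abstract: returns (left_foot_in_contact, right_foot_in_contact)."""
    return True, True

def physics_stage(q_kin, root_kin, state, contacts, dt=1.0 / 25.0):
    """Stage II sketch: track the kinematic reference in joint space,
    support the root with a ground reaction force when a foot touches
    the floor, and add a residual force so the root stays controllable.
    This is a toy point-mass model, not the paper's full optimiser."""
    q, root, root_vel = state
    # Joint-space tracking toward the kinematic reference (simplified).
    kp = 50.0
    alpha = min(1.0, dt * kp)                # clamp to keep the update stable
    q = q + alpha * (q_kin - q)
    # Root dynamics: gravity + ground reaction force + residual force.
    mass = 70.0                              # assumed body mass in kg
    gravity = np.array([0.0, -9.81, 0.0])
    f_grf = np.zeros(3)
    if any(contacts):
        # Ground reaction force cancels gravity and damps vertical motion.
        f_grf = -mass * gravity + np.array([0.0, -20.0 * root_vel[1], 0.0])
    # Residual force: weak spring pulling the root toward the kinematic
    # estimate, keeping it controllable when contact alone is not enough.
    f_res = 100.0 * (root_kin - root)
    accel = gravity + (f_grf + f_res) / mass
    root_vel = root_vel + dt * accel
    root = root + dt * root_vel
    # Hard floor constraint: no penetration, no downward velocity at contact.
    if root[1] < FLOOR_HEIGHT + 0.9:         # crude pelvis-height clamp
        root[1] = FLOOR_HEIGHT + 0.9
        root_vel[1] = max(root_vel[1], 0.0)
    return q, root, root_vel

if __name__ == "__main__":
    state = (np.zeros(30), np.array([0.0, 0.95, 0.0]), np.zeros(3))
    for t in range(5):                       # five frames at 25 fps
        frame = None                         # placeholder for a video frame
        q_kin, root_kin = kinematic_stage(frame)
        contacts = detect_foot_contacts(frame)
        state = physics_stage(q_kin, root_kin, state, contacts)
        print(f"frame {t}: root = {np.round(state[1], 3)}")
```

The split between ground reaction force and residual force mirrors the abstract's root-control design: the reaction force can only act through detected floor contacts, so a small residual force is kept as a fallback that steers the root toward the kinematic estimate when contacts are absent or misdetected.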

Updated: 2020-11-27