Neural Monocular 3D Human Motion Capture with Physical Awareness,arXiv - CS - Graphics



Neural Monocular 3D Human Motion Capture with Physical Awareness
arXiv - CS - Graphics, Pub Date: 2021-05-03, DOI: arxiv-2105.01057
Soshi Shimada, Vladislav Golyanik, Weipeng Xu, Patrick Pérez, Christian Theobalt

We present a new trainable system for physically plausible markerless 3D human motion capture, which achieves state-of-the-art results in a broad range of challenging scenarios. Unlike most neural methods for human motion capture, our approach, which we dub physionical, is aware of physical and environmental constraints. It combines several key innovations in a fully differentiable way, namely (1) a proportional-derivative controller, with gains predicted by a neural network, that reduces delays even in the presence of fast motions, (2) an explicit rigid body dynamics model, and (3) a novel optimisation layer that prevents physically implausible foot-floor penetration as a hard constraint. The inputs to our system are 2D joint keypoints, which are canonicalised in a novel way so as to reduce the dependency on intrinsic camera parameters, both at training and at test time. This enables more accurate global translation estimation without loss of generalisability. Our model can be finetuned with 2D annotations alone when 3D annotations are not available. It produces smooth and physically principled 3D motions at an interactive frame rate in a wide variety of challenging scenes, including newly recorded ones. Its advantages are especially noticeable on in-the-wild sequences that significantly differ from common 3D pose estimation benchmarks such as Human 3.6M and MPI-INF-3DHP. Qualitative results are available at http://gvv.mpi-inf.mpg.de/projects/PhysAware/
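As a rough illustration of the proportional-derivative (PD) control idea mentioned in the abstract, the following minimal sketch drives a single joint angle toward a target value. It is not the authors' implementation: the function names, the constant gains `kp` and `kd` (which the paper says are predicted by a neural network), the unit inertia, and the simple semi-implicit Euler integrator are all assumptions made for this example.

```python
def pd_torque(q, qdot, q_target, kp, kd):
    """PD control law: pull q toward q_target and damp the velocity qdot."""
    return kp * (q_target - q) - kd * qdot


def simulate(q0, q_target, kp, kd, inertia=1.0, dt=0.01, steps=500):
    """Integrate a 1-DoF joint under PD control (hypothetical toy model).

    The gains are fixed constants here; in the system described above they
    would come from a learned model instead.
    """
    q, qdot = float(q0), 0.0
    for _ in range(steps):
        tau = pd_torque(q, qdot, q_target, kp, kd)
        qdot += dt * tau / inertia  # update velocity from the control torque
        q += dt * qdot              # then update position (semi-implicit Euler)
    return q


if __name__ == "__main__":
    # Critically damped choice for unit inertia: kd = 2 * sqrt(kp).
    print(simulate(0.0, 1.0, kp=100.0, kd=20.0))
```

With these assumed gains the joint settles close to the target of 1.0 within the simulated five seconds. Larger `kp` tracks fast motions with less delay but needs matching damping and a small enough time step to stay stable.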


Updated: 2021-05-04
