MotioNet
ACM Transactions on Graphics (IF 6.2), Pub Date: 2020-09-04, DOI: 10.1145/3407659
Mingyi Shi, Kfir Aberman, Andreas Aristidou, Taku Komura, Dani Lischinski, Daniel Cohen-Or, Baoquan Chen

We introduce MotioNet, a deep neural network that directly reconstructs the motion of a 3D human skeleton from a monocular video. While previous methods rely on either rigging or inverse kinematics (IK) to associate a consistent skeleton with temporally coherent joint rotations, our method is the first data-driven approach that directly outputs a kinematic skeleton, which is a complete, commonly used motion representation. At the crux of our approach lies a deep neural network with embedded kinematic priors, which decomposes sequences of 2D joint positions into two separate attributes: a single, symmetric skeleton encoded by bone lengths, and a sequence of 3D joint rotations associated with global root positions and foot contact labels. These attributes are fed into an integrated forward kinematics (FK) layer that outputs 3D positions, which are compared to a ground truth. In addition, an adversarial loss is applied to the velocities of the recovered rotations to ensure that they lie on the manifold of natural joint rotations. The key advantage of our approach is that it learns to infer natural joint rotations directly from the training data rather than assuming an underlying model, or inferring them from joint positions using a data-agnostic IK solver. We show that enforcing a single consistent skeleton along with temporally coherent joint rotations constrains the solution space, leading to a more robust handling of self-occlusions and depth ambiguities.
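
The forward kinematics step described in the abstract (bone lengths, joint rotations, and a root position going in, 3D joint positions coming out) can be illustrated with a minimal sketch. This is not the authors' implementation: the five-joint hierarchy in PARENTS, the rest-pose directions in DIRECTIONS, the quaternion parameterization, and the function names are all assumptions made for illustration, and a real FK layer would be written in a differentiable framework so that a loss on the output positions can train the network.

```python
import math

# Hypothetical 5-joint skeleton (an assumption, not the paper's topology).
# PARENTS[j] is the parent index of joint j; -1 marks the root.
# Parents must appear before their children.
PARENTS = [-1, 0, 1, 0, 3]

# Assumed unit direction of each joint's offset from its parent in the rest pose.
DIRECTIONS = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, -1.0, 0.0),
    (-1.0, 0.0, 0.0),
    (0.0, -1.0, 0.0),
]


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def mat_mul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def forward_kinematics(bone_lengths, rotations, root_position):
    """Compute 3D joint positions for one frame.

    bone_lengths:  one length per joint (the root entry is unused).
    rotations:     one local quaternion (w, x, y, z) per joint.
    root_position: global (x, y, z) position of the root joint.
    """
    global_rot = []
    positions = []
    for j, parent in enumerate(PARENTS):
        local = quat_to_matrix(rotations[j])
        if parent < 0:
            global_rot.append(local)
            positions.append(list(root_position))
        else:
            # Accumulate rotations down the chain.
            global_rot.append(mat_mul(global_rot[parent], local))
            # Scale the rest-pose direction by the bone length, then
            # rotate it by the parent's accumulated rotation.
            offset = [bone_lengths[j] * d for d in DIRECTIONS[j]]
            rotated = mat_vec(global_rot[parent], offset)
            positions.append([p + r for p, r in zip(positions[parent], rotated)])
    return positions


# Usage: a symmetric skeleton (equal left and right bone lengths) in its rest pose.
identity = [(1.0, 0.0, 0.0, 0.0)] * len(PARENTS)
lengths = [0.0, 2.0, 1.0, 2.0, 1.0]
rest_pose = forward_kinematics(lengths, identity, (0.0, 1.0, 0.0))
```

Because the bone lengths are a single set shared by every frame and only the rotations and root position vary over time, every reconstructed pose in a sequence has the same limb proportions by construction, which is the skeleton-consistency constraint the abstract refers to.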

Updated: 2020-09-04