Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild



arXiv - CS - Machine Learning, Pub Date: 2020-03-17, DOI: arxiv-2003.07581
Umar Iqbal, Pavlo Molchanov, and Jan Kautz

One major challenge for monocular 3D human pose estimation in the wild is acquiring training data that contains unconstrained images annotated with accurate 3D poses. In this paper, we address this challenge by proposing a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data, which can be easily acquired in the wild. We propose a novel end-to-end learning framework that enables weakly-supervised training using multi-view consistency. Since multi-view consistency is prone to degenerate solutions, we adopt a 2.5D pose representation and propose a novel objective function that can only be minimized when the predictions of the trained model are consistent and plausible across all camera views. We evaluate our proposed approach on two large-scale datasets (Human3.6M and MPI-INF-3DHP), where it achieves state-of-the-art performance among semi-/weakly-supervised methods.
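The abstract does not spell out the objective function, so the following is only an illustrative sketch of the two ideas it names, not the authors' formulation. It assumes a 2.5D pose means 2D pixel coordinates plus a root-relative depth per joint, that the camera intrinsics `K` and the absolute root depth are known, and that consistency between two views is measured after a rigid (Kabsch) alignment because the relative camera pose is unknown. The function names `backproject_2p5d`, `rigid_align`, and `multiview_consistency` are hypothetical.

```python
import numpy as np


def backproject_2p5d(xy, rel_depth, root_depth, K):
    """Lift a 2.5D pose to 3D camera coordinates (assumed pinhole model).

    xy:         (J, 2) pixel coordinates of the joints
    rel_depth:  (J,)   depth of each joint relative to the root joint
    root_depth: scalar absolute depth of the root joint (assumed known here)
    K:          (3, 3) camera intrinsics
    """
    z = root_depth + rel_depth
    x = (xy[:, 0] - K[0, 2]) * z / K[0, 0]
    y = (xy[:, 1] - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)


def rigid_align(source, target):
    """Rotate and translate source (J, 3) onto target (J, 3) via Kabsch."""
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    src_c = source - src_mean
    tgt_c = target - tgt_mean
    # Cross-covariance of the centered joint sets.
    H = src_c.T @ tgt_c
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return src_c @ R.T + tgt_mean


def multiview_consistency(pose_a, pose_b):
    """Mean per-joint distance between two views after rigid alignment."""
    aligned = rigid_align(pose_a, pose_b)
    return float(np.mean(np.linalg.norm(aligned - pose_b, axis=1)))
```

Under these assumptions, two predictions of the same pose seen from different cameras give a value near zero, while predictions that disagree in shape give a positive value. Note that a consistency term alone is also minimized by any pose repeated identically in every view, which is the kind of degenerate solution the abstract says the 2.5D representation and the full objective are meant to rule out.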


Updated: 2020-03-18
