Single-shot 3D multi-person pose estimation in complex images
Pattern Recognition (IF 8), Pub Date: 2021-04-01, DOI: 10.1016/j.patcog.2020.107534
Abdallah Benzine, Bertrand Luvison, Quoc Cuong Pham, Catherine Achard

Abstract In this paper, we propose a new single-shot method for multi-person 3D human pose estimation in complex images. The model jointly learns to locate the human joints in the image, to estimate their 3D coordinates, and to group these predictions into full human skeletons. The proposed method handles a variable number of people and does not need bounding boxes to estimate the 3D poses. It leverages and extends the Stacked Hourglass Network and its multi-scale feature learning to manage multi-person situations. We exploit a robust 3D human pose formulation to fully describe several 3D human poses, even in cases of strong occlusion or cropping. Joint grouping and human pose estimation for an arbitrary number of people are then performed using the associative embedding method. Our approach significantly outperforms the state of the art on the challenging CMU Panoptic dataset, and it outperforms a previous single-shot method on the MuPoTS-3D dataset. Furthermore, it leads to good results on the complex, synthetic images of the newly proposed JTA Dataset.
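
The grouping step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each detected joint carries a scalar embedding tag and that joints of the same person have similar tags, which is the general idea behind associative embedding. The names `Detection`, `Person`, `group_by_tag`, and `tag_threshold` are hypothetical, and the greedy matching below is a simplification for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    joint_type: int   # index of the joint (e.g. 0 = head, 1 = neck); hypothetical numbering
    tag: float        # scalar embedding tag predicted for this detection
    xyz: tuple        # estimated 3D coordinates of the joint


@dataclass
class Person:
    joints: dict = field(default_factory=dict)  # joint_type -> Detection

    def mean_tag(self) -> float:
        # Average tag of the joints assigned so far (never called on an empty person).
        return sum(d.tag for d in self.joints.values()) / len(self.joints)


def group_by_tag(detections, num_joint_types, tag_threshold=1.0):
    """Greedily assign each detection to the person with the closest mean tag."""
    people = []
    for joint_type in range(num_joint_types):
        for det in (d for d in detections if d.joint_type == joint_type):
            # A person is a candidate if this joint slot is still free
            # and the tags are close enough.
            candidates = [
                p for p in people
                if joint_type not in p.joints
                and abs(p.mean_tag() - det.tag) < tag_threshold
            ]
            if candidates:
                best = min(candidates, key=lambda p: abs(p.mean_tag() - det.tag))
                best.joints[joint_type] = det
            else:
                # No compatible skeleton yet: start a new person.
                people.append(Person(joints={joint_type: det}))
    return people


# Two people with clearly separated tags (about 0 and about 5).
detections = [
    Detection(0, 0.1, (0.0, 0.0, 0.0)),
    Detection(0, 5.0, (1.0, 0.0, 0.0)),
    Detection(1, 0.2, (0.0, 1.0, 0.0)),
    Detection(1, 4.9, (1.0, 1.0, 0.0)),
]
people = group_by_tag(detections, num_joint_types=2)
```

Because grouping relies only on tag similarity, no bounding boxes are needed and the number of people is not fixed in advance: a new skeleton is started whenever a detection matches no existing one.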

Updated: 2021-04-01