Body Meshes as Points, arXiv - CS - Computer Vision and Pattern Recognition

Body Meshes as Points
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-05-06, DOI: arxiv-2105.02467
Jianfeng Zhang, Dongdong Yu, Jun Hao Liew, Xuecheng Nie, Jiashi Feng

We consider the challenging multi-person 3D body mesh estimation task in this work. Existing methods are mostly two-stage: one stage for person localization and another for individual body mesh estimation. This leads to redundant pipelines with high computation cost and degraded performance in complex scenes (e.g., scenes with occluded person instances). We present a single-stage model, Body Meshes as Points (BMP), to simplify the pipeline and improve both efficiency and performance. In particular, BMP represents multiple person instances as points in the spatial-depth space, where each point is associated with one body mesh. Building on this representation, BMP directly predicts body meshes for multiple persons in a single stage by concurrently localizing person instance points and estimating the corresponding body meshes. To better reason about the depth ordering of all persons within the same scene, BMP uses a simple yet effective inter-instance ordinal depth loss to obtain depth-coherent body mesh estimation. BMP also introduces a novel keypoint-aware augmentation to enhance model robustness to occluded person instances. Comprehensive experiments on the Panoptic, MuPoTS-3D and 3DPW benchmarks demonstrate the state-of-the-art efficiency of BMP for multi-person body mesh estimation, together with outstanding accuracy. Code can be found at: https://github.com/jfzhang95/BMP.
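The abstract does not give the form of the inter-instance ordinal depth loss, so the following is only a minimal sketch of a generic pairwise ranking loss over person depths, not the formulation used in BMP. The function name `ordinal_depth_loss`, the `tol` parameter, and the softplus penalty are all assumptions made for illustration. The idea it shows is that every pair of people in a scene contributes a penalty when their predicted depth order disagrees with the ground-truth order.

```python
import math
from itertools import combinations


def ordinal_depth_loss(pred_depths, gt_depths, tol=1e-6):
    """Hypothetical pairwise ordinal depth loss (illustrative, not the BMP loss).

    pred_depths, gt_depths: sequences of floats, one depth per person instance.
    tol: ground-truth depth gap below which two people count as equally deep.
    """
    pairs = list(combinations(range(len(pred_depths)), 2))
    if not pairs:
        # Zero or one person: there is no ordering to supervise.
        return 0.0

    total = 0.0
    for i, j in pairs:
        pred_diff = pred_depths[i] - pred_depths[j]
        gt_diff = gt_depths[i] - gt_depths[j]
        if abs(gt_diff) <= tol:
            # Same ground-truth depth: pull the two predictions together.
            total += pred_diff ** 2
        else:
            # sign is +1 if person i is farther than person j, else -1.
            sign = 1.0 if gt_diff > 0 else -1.0
            # Softplus penalty: small when the predicted order matches.
            total += math.log1p(math.exp(-sign * pred_diff))
    return total / len(pairs)


if __name__ == "__main__":
    gt = [1.0, 2.0, 3.0]
    print(ordinal_depth_loss([1.0, 2.0, 3.0], gt))  # order agrees with gt
    print(ordinal_depth_loss([3.0, 2.0, 1.0], gt))  # order is reversed
```

Because the penalty depends only on depth differences between people, a loss of this kind supervises relative ordering rather than absolute depth. A training implementation would normally be written with a differentiable tensor library instead of plain Python floats.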

Updated: 2021-05-07
