AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild
International Journal of Computer Vision ( IF 19.5 ) Pub Date : 2020-11-16 , DOI: 10.1007/s11263-020-01398-9
Zhe Zhang , Chunyu Wang , Weichao Qiu , Wenhu Qin , Wenjun Zeng

Occlusion is probably the biggest challenge for human pose estimation in the wild. Typical solutions often rely on intrusive sensors such as IMUs to detect occluded joints. To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method that enhances the features in occluded views by leveraging those in visible views. The core of AdaFuse is determining the point-point correspondence between two views, which we solve efficiently by exploiting the sparsity of the heatmap representation. We also learn an adaptive fusion weight for each camera view to reflect its feature quality, reducing the chance that good features are undesirably corrupted by "bad" views. The fusion model is trained end-to-end with the pose estimation network and can be directly applied to new camera configurations without additional adaptation. We extensively evaluate the approach on three public datasets: Human3.6M, Total Capture, and CMU Panoptic. It outperforms the state of the art on all of them. We also create a large-scale synthetic dataset, Occlusion-Person, which provides occlusion labels for every joint in the images and thus allows numerical evaluation on the occluded joints. The dataset and code are released at https://github.com/zhezh/adafuse-3d-human-pose .
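The core fusion idea in the abstract can be illustrated with a minimal sketch. In the full method, heatmaps from other views are warped into the reference view via the point-point (epipolar) correspondence before fusion; here we assume that warping has already been done, and `fuse_heatmaps` and the toy weights are hypothetical names, not the paper's actual implementation:

```python
import numpy as np

def fuse_heatmaps(heatmaps, weights):
    """Weighted fusion of per-view joint heatmaps.

    heatmaps: (V, H, W) array, one heatmap per camera view, assumed
              already warped into the reference view's image plane.
    weights:  (V,) adaptive per-view quality weights (learned in the
              actual method; fixed here for illustration).
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                             # normalise fusion weights
    fused = np.tensordot(w, heatmaps, axes=1)   # weighted sum over views
    return fused

# Toy example: view 0 is occluded (uninformative flat heatmap),
# view 1 sees the joint clearly at pixel (3, 4).
h, w = 8, 8
occluded = np.full((h, w), 1.0 / (h * w))
visible = np.zeros((h, w))
visible[3, 4] = 1.0

# A higher weight on the visible view lets its sharp peak dominate.
fused = fuse_heatmaps(np.stack([occluded, visible]), [0.2, 0.8])
peak = np.unravel_index(fused.argmax(), fused.shape)  # → (3, 4)
```

The design point the sketch captures is that down-weighting a low-quality view keeps its flat response from washing out the sharp peak contributed by a visible view.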

Updated: 2020-11-16