An end-to-end framework for unconstrained monocular 3D hand pose estimation
Pattern Recognition (IF 7.5) Pub Date: 2021-02-16, DOI: 10.1016/j.patcog.2021.107892
Sanjeev Sharma , Shaoli Huang

This work addresses the challenging problem of unconstrained 3D hand pose estimation from monocular RGB images. Most existing approaches assume that some prior knowledge of the hand (such as its location and side information) is available for 3D hand pose estimation, which restricts their use in unconstrained environments. We therefore present an end-to-end framework that robustly predicts hand prior information and accurately infers 3D hand pose by learning ConvNet models using only keypoint annotations. To enhance the hand detector’s robustness, we propose a novel keypoint-based method that simultaneously predicts hand regions and side labels, unlike existing segmentation- or detection-based methods, which suffer from background color confusion. Moreover, inspired by the biological structure of the human hand, we introduce two geometric constraints directly into the 3D coordinate prediction, which further improves performance. Experimental results show that our proposed framework outperforms state-of-the-art methods on standard benchmark datasets while providing robust predictions.
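The abstract does not specify the exact form of the two geometric constraints, so the following is only an illustrative sketch of one common way such a constraint is imposed on predicted 3D hand keypoints: a bone-length consistency penalty over a standard 21-keypoint hand skeleton. The `BONES` topology and the loss form are assumptions for illustration, not the authors' actual formulation.

```python
import numpy as np

# Assumed 21-keypoint hand skeleton: (parent, child) index pairs,
# wrist = 0, then four joints per finger. This topology is a common
# convention, not taken from the paper.
BONES = [(0, 1), (1, 2), (2, 3), (3, 4),         # thumb
         (0, 5), (5, 6), (6, 7), (7, 8),         # index
         (0, 9), (9, 10), (10, 11), (11, 12),    # middle
         (0, 13), (13, 14), (14, 15), (15, 16),  # ring
         (0, 17), (17, 18), (18, 19), (19, 20)]  # little

def bone_lengths(joints):
    """joints: (21, 3) array of 3D keypoints -> (20,) array of bone lengths."""
    parents = np.array([joints[a] for a, _ in BONES])
    children = np.array([joints[b] for _, b in BONES])
    return np.linalg.norm(children - parents, axis=1)

def bone_length_loss(pred_joints, ref_lengths):
    """Mean squared deviation of predicted bone lengths from reference
    lengths -- a hypothetical geometric-constraint term that could be
    added to the keypoint regression loss during training."""
    return float(np.mean((bone_lengths(pred_joints) - ref_lengths) ** 2))
```

In a training loop, a term like `bone_length_loss` would typically be weighted and added to the keypoint regression loss, discouraging anatomically implausible poses without requiring extra annotations.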




Updated: 2021-02-26