3D hand mesh reconstruction from a monocular RGB image, The Visual Computer


3D hand mesh reconstruction from a monocular RGB image
The Visual Computer (IF 3.0), Pub Date: 2020-07-14, DOI: 10.1007/s00371-020-01908-3
Hao Peng, Chuhua Xian, Yunbo Zhang

Most existing methods for 3D hand analysis from RGB images focus on estimating hand keypoints or poses, which cannot capture the geometric details of the 3D hand shape. In this work, we propose a novel method to reconstruct a 3D hand mesh from a single monocular RGB image. Unlike current parameter-based or pose-based methods, our method directly estimates the 3D hand mesh using a graph convolutional neural network (GCN). Our network consists of two modules: a hand localization and mask generation module, and a 3D hand mesh reconstruction module. The first module, a VGG16-based network, localizes the hand region in the input image and generates a binary mask of the hand. The second module takes the high-order features from the first and uses a GCN-based network to estimate the coordinates of each vertex of the hand mesh and reconstruct the 3D hand shape. To achieve better accuracy, we propose a novel loss based on the differential properties of the discrete mesh. We also use professional software to create a large synthetic dataset that contains both ground-truth 3D hand meshes and poses for training. To handle real-world data, we use a CycleGAN network to transform real-world images into the domain of our synthetic dataset. We demonstrate that our method produces accurate 3D hand meshes and runs efficiently enough for real-time applications.
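
The abstract does not give the layer definition or the exact form of the loss, so the following is only an illustrative sketch of the two ideas it names: a graph convolution over mesh vertices and a loss built on a differential property of the discrete mesh. It assumes a mean-aggregation graph convolution and a uniform (umbrella) Laplacian as the differential quantity; both choices, and all function names, are hypothetical and are not taken from the paper.

```python
import numpy as np

def adjacency_from_faces(faces, num_vertices):
    """Build a symmetric 0/1 vertex adjacency matrix from triangle faces."""
    A = np.zeros((num_vertices, num_vertices))
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            A[i, j] = 1.0
            A[j, i] = 1.0
    return A

def graph_conv(X, A, W):
    """One graph-convolution step (assumed form, not the paper's layer):
    average each vertex feature with its neighbours, apply W, then ReLU."""
    A_self = A + np.eye(A.shape[0])           # add self-loops
    deg = A_self.sum(axis=1, keepdims=True)   # row degrees, always >= 1
    return np.maximum((A_self / deg) @ X @ W, 0.0)

def laplacian_coords(V, A):
    """Uniform Laplacian: each vertex minus the mean of its neighbours."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # guard isolated vertices
    return V - (A @ V) / deg

def laplacian_loss(V_pred, V_gt, A):
    """Mean squared difference between predicted and ground-truth
    Laplacian coordinates (a stand-in for the paper's differential loss)."""
    diff = laplacian_coords(V_pred, A) - laplacian_coords(V_gt, A)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

if __name__ == "__main__":
    # Toy mesh: a tetrahedron with 4 vertices and 4 triangular faces.
    faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    V_gt = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
    A = adjacency_from_faces(faces, 4)

    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 8))               # hypothetical layer weights
    features = graph_conv(V_gt, A, W)         # per-vertex features, shape (4, 8)

    V_pred = V_gt + 0.05 * rng.normal(size=V_gt.shape)
    print(features.shape, laplacian_loss(V_pred, V_gt, A))
```

A loss of this kind compares local surface shape and ignores a global translation of the mesh, which is why it would normally be combined with a per-vertex position term in practice.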


Updated: 2020-07-14
