Multi-Level Fusion Net for hand pose estimation in hand-object interaction
Signal Processing: Image Communication ( IF 3.5 ) Pub Date : 2021-02-11 , DOI: 10.1016/j.image.2021.116196
Xiang-Bo Lin , Yi-Dan Zhou , Kuo Du , Yi Sun , Xiao-Hong Ma , Jian Lu

This work addresses the challenging problem of estimating the full 3D hand pose when a hand interacts with an unknown object. Compared with isolated single-hand pose estimation, the occlusion and interference caused by the manipulated object and the cluttered background make this task considerably more difficult. Our proposed Multi-Level Fusion Net focuses on extracting more effective features to overcome these difficulties through a multi-level fusion design built into a new end-to-end Convolutional Neural Network (CNN) framework. It takes cropped RGBD data from a single RGBD camera at a free viewpoint as input, without requiring additional hand–object pre-segmentation or any pre-modeling of the object or the hand. Through extensive evaluations on a public hand–object interaction dataset, we demonstrate the state-of-the-art performance of our method.
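The abstract only outlines the approach at a high level: a multi-level fusion CNN that maps a cropped RGBD patch directly to 3D hand joint locations. The sketch below illustrates that general idea under stated assumptions; the layer sizes, the three-stage backbone, the concatenation-based fusion, and the 21-joint output are illustrative choices of ours, not the architecture described in the paper.

```python
import torch
import torch.nn as nn


class MultiLevelFusionSketch(nn.Module):
    """Illustrative sketch only: fuse features from several network levels to
    regress 3D hand joints from a cropped RGBD patch. All layer sizes, the
    joint count (21) and the fusion scheme are assumptions, not the paper's
    actual Multi-Level Fusion Net."""

    def __init__(self, num_joints: int = 21):
        super().__init__()
        # RGBD input: 3 color channels + 1 depth channel.
        self.stage1 = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU())
        # Regress joint coordinates from the fused multi-level descriptor.
        self.regressor = nn.Sequential(
            nn.Linear(32 + 64 + 128, 256), nn.ReLU(),
            nn.Linear(256, num_joints * 3))
        self.num_joints = num_joints

    def forward(self, rgbd: torch.Tensor) -> torch.Tensor:
        f1 = self.stage1(rgbd)                  # low-level features
        f2 = self.stage2(f1)                    # mid-level features
        f3 = self.stage3(f2)                    # high-level features
        # Global-average-pool each level, then fuse by concatenation.
        pooled = [f.mean(dim=(2, 3)) for f in (f1, f2, f3)]
        fused = torch.cat(pooled, dim=1)
        return self.regressor(fused).view(-1, self.num_joints, 3)


if __name__ == "__main__":
    # A batch of two 128x128 cropped RGBD patches -> (2, 21, 3) joint coordinates.
    net = MultiLevelFusionSketch()
    out = net(torch.randn(2, 4, 128, 128))
    print(out.shape)
```

The point of the sketch is the end-to-end mapping from a single cropped RGBD input to 3D joints with features drawn from multiple depths of the network; the paper's actual fusion strategy and backbone should be taken from the publication itself.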



Last updated: 2021-02-24