Object-RPE: Dense 3D reconstruction and pose estimation with convolutional neural networks
Robotics and Autonomous Systems (IF 4.3), Pub Date: 2020-11-01, DOI: 10.1016/j.robot.2020.103632
Dinh-Cuong Hoang, Achim J. Lilienthal, Todor Stoyanov

Abstract We present an approach for recognizing objects present in a scene and estimating their full pose by means of an accurate 3D instance-aware semantic reconstruction. Our framework couples convolutional neural networks (CNNs) with a state-of-the-art dense Simultaneous Localization and Mapping (SLAM) system, ElasticFusion (Whelan et al., 2016), to achieve both high-quality semantic reconstruction and robust 6D pose estimation for relevant objects. We use the ElasticFusion pipeline as a backbone and propose a joint geometric and photometric error function with per-pixel adaptive weights. While the main trend in CNN-based 6D pose estimation has been to infer an object's position and orientation from single views of the scene, our approach explores pose estimation from multiple viewpoints, under the conjecture that combining multiple predictions can improve the robustness of an object detection system. The resulting system is capable of producing high-quality instance-aware semantic reconstructions of room-sized environments, as well as accurately detecting objects and their 6D poses. The method has been verified through extensive experiments on different datasets. Experimental results confirmed that the proposed system achieves improvements over state-of-the-art methods in terms of surface reconstruction and object pose prediction. Our code and video are available at https://sites.google.com/view/object-rpe.
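
The abstract does not state the form of the joint error function, so the following is only an illustrative sketch of what a joint geometric and photometric cost with per-pixel weights could look like. It assumes a convex combination per pixel, E = sum_i [ w_i * g_i^2 + (1 - w_i) * p_i^2 ], where g_i and p_i are geometric and photometric residuals and w_i is a weight in [0, 1]. The function name, the residual inputs, and the weighting scheme are all hypothetical and are not taken from the paper or from ElasticFusion.

```python
from typing import Sequence


def joint_error(
    geometric_residuals: Sequence[float],
    photometric_residuals: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Illustrative joint cost (assumed form, not the paper's definition).

    Each pixel i contributes w_i * g_i^2 + (1 - w_i) * p_i^2, so a weight
    near 1 trusts the geometric term and a weight near 0 trusts the
    photometric term for that pixel.
    """
    if not (len(geometric_residuals) == len(photometric_residuals) == len(weights)):
        raise ValueError("all inputs must have one entry per pixel")

    total = 0.0
    for g, p, w in zip(geometric_residuals, photometric_residuals, weights):
        if not 0.0 <= w <= 1.0:
            raise ValueError("per-pixel weights must lie in [0, 1]")
        total += w * g * g + (1.0 - w) * p * p
    return total


if __name__ == "__main__":
    # Two pixels: the first relies on geometry only, the second on colour only.
    print(joint_error([1.0, 2.0], [3.0, 4.0], [1.0, 0.0]))
```

In this sketch the weights are supplied by the caller; how the authors compute their adaptive weights is described in the paper itself, not in this abstract.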

Updated: 2020-11-01