A Novel Depth and Color Feature Fusion Framework for 6D Object Pose Estimation
IEEE Transactions on Multimedia (IF 7.3). Pub Date: 2020-06-11. DOI: 10.1109/tmm.2020.3001533
Guangliang Zhou, Yi Yan, Deming Wang, Qijun Chen

This paper addresses the problem of estimating the 6D pose of an object under occlusion from RGB-D images. Most existing methods use the color and depth images separately to make predictions, which limits their performance in the presence of occlusion. Instead, we propose a pipeline that effectively fuses color and depth information and performs region-level pose estimation. Our method first uses a CNN to extract color features, then obtains fusion features by attaching the color features to the corresponding points of the point cloud. Unlike existing methods, the fusion features take the form of point sets rather than feature maps. We further process the fusion features with a PointNet++-like network to obtain several region-level features, each of which predicts a pose together with a confidence score. The pose with the highest confidence is chosen as the final output. Experiments show that the proposed method outperforms state-of-the-art methods on both the LINEMOD and Occlusion LINEMOD datasets, indicating that the proposed pipeline obtains accurate pose estimates and is robust to occlusion.
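The two key steps of the abstract's pipeline — attaching per-point color features to the point cloud to form point-set fusion features, and selecting the region-level pose prediction with the highest confidence — can be sketched as below. This is a minimal NumPy illustration under assumed shapes, not the paper's actual networks: the CNN color extractor and the PointNet++-like regressor are elided, and the function names and the `(N, 3)` / `(N, C)` layouts are hypothetical.

```python
import numpy as np

def fuse_color_and_geometry(points, color_features):
    """Form point-set fusion features by concatenating each 3D point
    (from the depth image) with the CNN color feature sampled at the
    pixel it projects to.  Shapes are assumptions for illustration:
    points: (N, 3), color_features: (N, C) -> fused: (N, 3 + C)."""
    return np.concatenate([points, color_features], axis=1)

def select_pose(region_poses, confidences):
    """Pick the region-level pose prediction with the highest
    confidence score, as the abstract's final-output rule."""
    best = int(np.argmax(confidences))
    return region_poses[best]

# Toy example: 4 points with 2-dim color features (stand-ins for the
# real CNN features and depth-derived point cloud).
pts = np.array([[0.0, 0.1, 0.5],
                [0.1, 0.1, 0.5],
                [0.0, 0.2, 0.6],
                [0.1, 0.2, 0.6]])
feats = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.7, 0.3],
                  [0.6, 0.4]])
fused = fuse_color_and_geometry(pts, feats)   # shape (4, 5)

# Two hypothetical region-level 4x4 pose hypotheses with confidences.
pose_a = np.eye(4)
pose_b = np.eye(4)
pose_b[:3, 3] = [0.10, 0.00, 0.25]            # translated hypothesis
confidences = np.array([0.35, 0.90])
chosen = select_pose([pose_a, pose_b], confidences)
```

In a real implementation the confidences would come from the region-level heads of the PointNet++-like network rather than being given, but the max-confidence selection itself is this simple.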

Updated: 2020-06-11