A review on object pose recovery: From 3D bounding box detectors to full 6D pose estimators
Image and Vision Computing (IF 4.7), Pub Date: 2020-03-06, DOI: 10.1016/j.imavis.2020.103898
Caner Sahin, Guillermo Garcia-Hernando, Juil Sock, Tae-Kyun Kim

Object pose recovery has gained increasing attention in the computer vision field as it has become an important problem in rapidly evolving technological areas related to autonomous driving, robotics, and augmented reality. Existing reviews have addressed the problem at the visual level in 2D, covering methods that produce 2D bounding boxes of objects of interest in RGB images. The 2D search space is enlarged either by using the geometry information available in 3D space along with RGB (mono/stereo) images, or by utilizing depth data from LIDAR sensors and/or RGB-D cameras. 3D bounding box detectors, which produce category-level amodal 3D bounding boxes, are evaluated on gravity-aligned images, while full 6D object pose estimators are mostly tested at the instance level on images where the alignment constraint is removed. Recently, 6D object pose estimation has also been tackled at the category level. In this paper, we present the first comprehensive and most recent review of methods for object pose recovery, from 3D bounding box detectors to full 6D pose estimators. The methods mathematically model the problem as a classification, regression, classification & regression, template matching, or point-pair feature matching task. Based on this, a mathematical-model-based categorization of the methods is established. The datasets used for evaluating the methods are investigated with respect to the challenges, and the evaluation metrics are studied. Quantitative results of experiments in the literature are analyzed to show which category of methods performs best under which types of challenges. The analyses are further extended by comparing two methods, which are our own implementations, so that the outcomes from the public results are further solidified. The current position of the field regarding object pose recovery is summarized, and possible research directions are identified.




Updated: 2020-03-06