Shape Prior Guided Instance Disparity Estimation for 3D Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8). Pub Date: 2021-04-29. DOI: 10.1109/tpami.2021.3076678
Linghao Chen, Jiaming Sun, Yiming Xie, Siyu Zhang, Qing Shuai, Qinhong Jiang, Guofeng Zhang, Hujun Bao, Xiaowei Zhou

In this paper, we propose a novel system named Disp R-CNN for 3D object detection from stereo images. Many recent works solve this problem by first recovering point clouds with disparity estimation and then applying a 3D detector. In these methods, the disparity map is computed for the entire image, which is costly and fails to leverage category-specific priors. In contrast, we design an instance disparity estimation network (iDispNet) that predicts disparity only for pixels on objects of interest and learns a category-specific shape prior for more accurate disparity estimation. To address the challenge posed by the scarcity of disparity annotations for training, we propose to use a statistical shape model to generate dense disparity pseudo-ground-truth without the need for LiDAR point clouds, which makes our system more widely applicable. Experiments on the KITTI dataset show that, when LiDAR ground-truth is not used at training time, Disp R-CNN outperforms previous state-of-the-art methods based on stereo input by 20 percent in terms of average precision for all categories. The code and pseudo-ground-truth data are available at the project page: https://github.com/zju3dv/disprcnn.
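The instance-level idea pairs naturally with the standard stereo relation depth = f · baseline / disparity. Below is a minimal sketch, not the authors' released code, of how masked instance disparity can be lifted to an object point cloud; the function name and the calibration defaults (fx, cx, cy, baseline) are hypothetical KITTI-like placeholders.

```python
import numpy as np

def instance_points_from_disparity(disparity, mask,
                                   fx=721.5, cx=609.6, cy=172.9,
                                   baseline=0.54):
    """Back-project masked disparity pixels into a 3D point cloud.

    disparity: (H, W) float array; mask: (H, W) bool array marking the
    object's pixels. Calibration defaults are hypothetical KITTI-like
    values, not taken from the paper.
    """
    # Keep only pixels on the object that have a valid (positive) disparity.
    v, u = np.nonzero(mask & (disparity > 0))
    # Standard stereo relation: depth = focal_length * baseline / disparity.
    z = fx * baseline / disparity[v, u]
    # Pinhole back-projection (assumes fy ~= fx for brevity).
    x = (u - cx) * z / fx
    y = (v - cy) * z / fx
    return np.stack([x, y, z], axis=1)  # (N, 3) instance point cloud
```

Because only mask pixels enter the computation, the cost scales with object area rather than image area, which is the efficiency argument the abstract makes against full-image disparity.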

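The pseudo-ground-truth idea can be sketched in the same spirit: given surface points of a fitted statistical shape model posed in the left-camera frame, projecting them into the image and converting depth back to disparity yields dense supervision without LiDAR. This is a hedged illustration under that assumption; pseudo_gt_disparity and its input format are inventions for this sketch, not the paper's exact pipeline.

```python
import numpy as np

def pseudo_gt_disparity(points_cam, H, W,
                        fx=721.5, fy=721.5, cx=609.6, cy=172.9,
                        baseline=0.54):
    """Splat posed shape-model surface points into a disparity map.

    points_cam: (N, 3) points of a fitted statistical shape model in the
    left-camera frame (a hypothetical input standing in for the paper's
    per-instance shape fit).
    """
    X, Y, Z = points_cam.T
    keep = Z > 0                       # only points in front of the camera
    u = np.round(fx * X[keep] / Z[keep] + cx).astype(int)
    v = np.round(fy * Y[keep] / Z[keep] + cy).astype(int)
    d = fx * baseline / Z[keep]        # convert depth back to disparity
    disp = np.zeros((H, W), dtype=np.float32)
    inb = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    # Nearest surface wins per pixel: larger disparity means smaller depth.
    np.maximum.at(disp, (v[inb], u[inb]), d[inb])
    return disp
```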
Updated: 2021-04-29