MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks.
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2019-11-13, DOI: 10.1109/tip.2019.2952201
Wentao Bao, Bin Xu, Zhenzhong Chen

Monocular 3D object detection has the merit of low cost and can serve as an auxiliary module for autonomous driving systems, and it has attracted growing attention in recent years. In this paper, we present a monocular 3D object detection method with feature enhancement networks, which we call MonoFENet. Specifically, with the disparity estimated from the input monocular image, the features of both the 2D and 3D streams can be enhanced and utilized for accurate 3D localization. For the 2D stream, the input image is used to generate 2D region proposals as well as to extract appearance features. For the 3D stream, the estimated disparity is transformed into a dense 3D point cloud, which is then enhanced by the associated front view maps. With the RoI Mean Pooling layer, the 3D geometric features of RoI point clouds are further enhanced by the proposed point feature enhancement (PointFE) network. The region-wise features of the image and the point cloud are fused for the final 2D and 3D bounding box regression. Experimental results on the KITTI benchmark show that our method achieves state-of-the-art performance for monocular 3D object detection.
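
The abstract's 3D stream hinges on converting the estimated disparity map into a dense point cloud. The sketch below shows the standard pinhole back-projection commonly used for this step on KITTI-style data; the function name, intrinsics (fx, fy, cx, cy), and stereo baseline are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def disparity_to_point_cloud(disparity, fx, fy, cx, cy, baseline):
    """Back-project a dense disparity map into camera-frame 3D points.

    A minimal sketch of the disparity-to-point-cloud step, assuming a
    pinhole camera with intrinsics (fx, fy, cx, cy) and a stereo baseline
    taken from the dataset calibration (e.g., KITTI).
    """
    h, w = disparity.shape

    # Depth from disparity: Z = fx * baseline / d, valid only where d > 0.
    valid = disparity > 0
    z = np.zeros_like(disparity, dtype=np.float32)
    z[valid] = fx * baseline / disparity[valid]

    # Pixel grid for back-projection through the pinhole model.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    # Collect valid pixels into an (N, 3) array of 3D points.
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)
```

Under this sketch, the resulting (N, 3) point array is what a subsequent RoI pooling and point-feature network (such as the PointFE network described above) would consume per region proposal.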

Updated: 2020-04-22