RTL3D: real-time LIDAR-based 3D object detection with sparse CNN
IET Computer Vision (IF 1.5), Pub Date: 2020-08-06, DOI: 10.1049/iet-cvi.2019.0508
Lin Yan 1, Kai Liu 1, Evgeny Belyaev 2, Meiyu Duan 1

Real-time 3D perception based on LIDAR (light detection and ranging) is crucial for applications such as autonomous driving. However, most convolutional neural network (CNN) based methods are time-consuming and computation-intensive. These drawbacks are mainly attributed to the highly variable density of LIDAR point clouds and the complexity of the detection pipelines. To find a balance between speed and accuracy for 3D object detection from LIDAR, the authors propose RTL3D, a computationally efficient real-time LIDAR-based 3D detector. In RTL3D, an effective voxel-wise feature representation is utilised to organise the unstructured point cloud. By employing a sparse feature learning network (SFLN) on the voxelised 3D data, RTL3D exploits the sparsity of the point cloud and down-samples the 3D data into 2D. Based on the generated 2D feature map, an optimised dense detection network (DDN) is applied to regress the oriented bounding box without relying on any predefined anchor boxes. The authors also introduce an incremental data augmentation approach which greatly improves the performance of RTL3D. Experiments on the public KITTI benchmark demonstrate that RTL3D achieves performance competitive with state-of-the-art methods on the 3D detection task. Owing to the simplicity of its single-stage and anchor-free design, RTL3D has a real-time inference speed of 40 FPS.
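As a rough illustration of how an unstructured point cloud can be organised into voxels while keeping only the occupied cells, here is a minimal sketch. It is not the RTL3D implementation: the function name `voxelize`, the voxel size, and the choice of the per-voxel mean as the feature are all assumptions made for this example, since the abstract does not specify them.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size=(0.2, 0.2, 0.4)):
    """Group (x, y, z) points by voxel index and average each group.

    Hypothetical helper: the voxel size and the mean feature are
    illustrative assumptions, not values taken from the paper.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer voxel index along each axis.
        key = (
            floor(x / voxel_size[0]),
            floor(y / voxel_size[1]),
            floor(z / voxel_size[2]),
        )
        buckets[key].append((x, y, z))

    # Only occupied voxels are stored, so memory scales with the number
    # of non-empty cells rather than with the full 3D grid.
    features = {}
    for key, pts in buckets.items():
        n = len(pts)
        features[key] = tuple(sum(p[i] for p in pts) / n for i in range(3))
    return features


points = [(0.05, 0.05, 0.1), (0.15, 0.05, 0.3), (1.1, 2.1, 0.5)]
voxels = voxelize(points)
```

Storing the result as a dictionary keyed by voxel index mirrors the idea of exploiting sparsity: empty regions of the scene cost nothing, which is the property a sparse convolutional network relies on.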

Updated: 2020-08-20