A multi-scale fully convolutional network for semantic labeling of 3D point clouds
ISPRS Journal of Photogrammetry and Remote Sensing ( IF 12.7 ) Pub Date : 2018-05-16 , DOI: 10.1016/j.isprsjprs.2018.03.018
Mohammed Yousefhussien , David J. Kelbe , Emmett J. Ientilucci , Carl Salvaggio

When classifying point clouds, a large amount of time is devoted to engineering a reliable set of features, which are then passed to a classifier of choice. Such features – usually derived from the 3D covariance matrix – are computed over the surrounding neighborhood of points. While these features capture local information, the process is time-consuming and must be applied at multiple scales, combined with contextual methods, to adequately describe the diversity of objects within a scene. In this paper we present a novel 1D fully convolutional network that consumes terrain-normalized points directly, with the corresponding spectral data (if available), to generate point-wise labels while implicitly learning contextual features in an end-to-end fashion. This approach allows us to operate on unordered point sets with varying densities, without relying on expensive hand-crafted features, reducing the time needed for testing by an order of magnitude over existing approaches. Our method uses only the 3D coordinates and three corresponding spectral features for each point. Spectral features may either be extracted from 2D georeferenced images, as shown here for Light Detection and Ranging (LiDAR) point clouds, or taken directly from passively derived point clouds, i.e. from multiple-view imagery. We train our network by splitting the data into square regions and use a pooling layer that respects the permutation invariance of the input points. Evaluated on the ISPRS 3D Semantic Labeling Contest, our method scored second place with an overall accuracy of 81.6%. We ranked third place with a mean F1-score of 63.32%, surpassing the F1-score of the method with the highest accuracy by 1.69%. In addition to labeling 3D point clouds, we also show that our method can be easily extended to 2D semantic segmentation tasks, with promising initial results.
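The core idea – shared per-point transformations followed by a symmetric pooling layer, so the output does not depend on point ordering – can be illustrated with a minimal numpy sketch. The layer widths, the 9-class output, and the random weights below are illustrative assumptions, not the paper's actual architecture; the input is an (N, 6) array of terrain-normalized XYZ plus three spectral features per point, as described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights; the paper's trained network differs in depth and size.
W1 = rng.normal(size=(6, 32))   # shared "1x1 conv": 6 inputs (XYZ + 3 spectral)
W2 = rng.normal(size=(64, 9))   # per-point classifier over 9 example classes

def label_points(points):
    """points: (N, 6) terrain-normalized coordinates + spectral features."""
    h = np.maximum(points @ W1, 0.0)      # per-point MLP with ReLU, applied to each point
    g = h.max(axis=0)                     # symmetric max-pool: order-free global context
    h = np.concatenate([h, np.broadcast_to(g, h.shape)], axis=1)  # local + global, (N, 64)
    return (h @ W2).argmax(axis=1)        # point-wise class labels

pts = rng.normal(size=(100, 6))
perm = rng.permutation(100)
labels = label_points(pts)
# Shuffling the input only shuffles the labels the same way: permutation invariance.
assert np.array_equal(labels[perm], label_points(pts[perm]))
```

Because every learned operation is either applied identically to each point or is a symmetric reduction over all points, the network can consume unordered point sets of varying density directly, which is what removes the need for hand-crafted neighborhood features.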




Updated: 2018-05-16