Large-Scale ALS Point Cloud Segmentation via Projection-Based Context Embedding
IEEE Transactions on Geoscience and Remote Sensing (IF 8.2), Pub Date: 2024-04-22, DOI: 10.1109/tgrs.2024.3392267
Hengming Dai, Xiangyun Hu, Jinming Zhang, Zhen Shu, Jiabo Xu, Juan Du

Semantic segmentation of airborne laser scanning (ALS) point clouds is a valuable yet challenging task in remote sensing. When processing large-scale ALS scenes, it is necessary to partition them into smaller blocks for ease of handling. However, this partitioning makes it difficult to capture enough spatial context within each block to adequately recognize objects with a large spatial span. The limitation becomes particularly pronounced when relying solely on 3-D representations as the input of neural networks. To incorporate sufficient contextual information into ALS semantic segmentation, we propose a multimodal segmentation framework called projection-based context embedding (PCE). PCE combines the complementary advantages of 2-D image and 3-D point-voxel representations: the computational efficiency of the former and the capability of the latter to represent fine-grained 3-D geometry. The 2-D projection encodes a large-scale semantic context that would be computationally expensive to obtain with a pure 3-D representation. Simultaneously, sparse point-voxel convolution (SPVConv) is employed to learn 3-D features from a small block of points centered within that large-scale context. Finally, to fully exploit each modality, an embedding disentangling (ED) strategy is proposed to combine the context embedding from the 2-D image with the 3-D features for the final prediction. Extensive experiments on public large-scale ALS point cloud datasets demonstrate the state-of-the-art performance of PCE.
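The abstract describes a two-branch design: a 2-D CNN over a projected image supplies large-scale context, while a 3-D branch processes a small point block, and the two embeddings are fused per point before classification. The following minimal PyTorch sketch illustrates that general idea only; all module names, channel sizes, the pointwise-MLP stand-in for SPVConv, and the simple concatenation used in place of the embedding disentangling strategy are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a projection-based context embedding pipeline:
# 2-D context encoder over a projected image + per-point 3-D branch,
# fused per point via the point-to-pixel projection indices.
import torch
import torch.nn as nn


class ContextEncoder2D(nn.Module):
    """Encodes a projected (e.g. height/intensity) image into per-pixel context embeddings."""
    def __init__(self, in_ch: int = 2, ctx_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, ctx_dim, 3, padding=1), nn.BatchNorm2d(ctx_dim), nn.ReLU(),
        )

    def forward(self, img):                      # img: (B, in_ch, H, W)
        return self.net(img)                     # (B, ctx_dim, H, W)


class PointBranch3D(nn.Module):
    """Per-point feature extractor; a pointwise MLP stands in for SPVConv in this sketch."""
    def __init__(self, in_dim: int = 3, feat_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

    def forward(self, pts):                      # pts: (B, N, in_dim)
        return self.mlp(pts)                     # (B, N, feat_dim)


class PCESketch(nn.Module):
    def __init__(self, num_classes: int = 9, ctx_dim: int = 64, feat_dim: int = 64):
        super().__init__()
        self.ctx2d = ContextEncoder2D(ctx_dim=ctx_dim)
        self.branch3d = PointBranch3D(feat_dim=feat_dim)
        self.classifier = nn.Linear(ctx_dim + feat_dim, num_classes)

    def forward(self, img, pts, pix_idx):
        # pix_idx: (B, N, 2) integer (row, col) of each point's projected pixel.
        ctx = self.ctx2d(img)                    # (B, C, H, W)
        B, C, H, W = ctx.shape
        flat = ctx.flatten(2)                    # (B, C, H*W)
        lin = pix_idx[..., 0] * W + pix_idx[..., 1]           # (B, N) linear pixel index
        per_point_ctx = torch.gather(
            flat, 2, lin.unsqueeze(1).expand(-1, C, -1)       # gather context at each point's pixel
        ).transpose(1, 2)                        # (B, N, C)
        feat3d = self.branch3d(pts)              # (B, N, feat_dim)
        fused = torch.cat([per_point_ctx, feat3d], dim=-1)    # naive fusion in place of ED
        return self.classifier(fused)            # per-point class logits (B, N, num_classes)


if __name__ == "__main__":
    model = PCESketch()
    img = torch.randn(1, 2, 128, 128)            # projected height/intensity image of the scene
    pts = torch.randn(1, 4096, 3)                # one small block of points
    pix = torch.randint(0, 128, (1, 4096, 2))    # projection indices of each point
    print(model(img, pts, pix).shape)            # torch.Size([1, 4096, 9])
```

The gather step is the key link between modalities: each 3-D point inherits the context embedding of the pixel it projects onto, so the small block sees information from the whole projected scene at low cost.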

Updated: 2024-04-22