International Journal of Applied Earth Observation and Geoinformation (IF 7.5), Pub Date: 2022-08-22, DOI: 10.1016/j.jag.2022.102987
Xiaoling Jiang, Yinyin Li, Tao Jiang, Junhao Xie, Yilong Wu, Qianfeng Cai, Jinhui Jiang, Jiaming Xu, Hui Zhang
Complete and accurate road network information underpins numerous transportation-related applications. Updating road network inventories regularly and rapidly is therefore necessary to provide better services. Remote sensing images, with their overhead view of the Earth's surface, have been widely used to assist road network interpretation. However, accurately separating roads from the surrounding land cover in remote sensing images, while preserving connectivity and completeness, remains an open problem because of the highly challenging conditions under which roads appear. To address this, we develop a pyramidal deformable vision transformer architecture, termed RoadFormer, to extract road networks from remote sensing images. Specifically, a multi-context patch embedding scheme adopts a multi-range, multi-view context observation strategy to obtain higher-quality token embeddings. Furthermore, a deformable transformer architecture focuses on semantically relevant features in a sparse global manner, which effectively improves the quality and robustness of the feature representation. RoadFormer is thoroughly evaluated on three large-scale road network extraction datasets. Quantitative assessments show that RoadFormer achieves an overall intersection over union (IoU) of 0.8886 and an overall F1-score of 0.9407. In addition, comparative evaluations demonstrate the promising potential and superior performance of RoadFormer in interpreting road sections of varying conditions under diverse challenging image scenarios.
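For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how IoU and F1-score are conventionally computed for a binary road mask. It uses the standard definitions only; the abstract does not describe the paper's exact evaluation protocol (thresholding, per-image versus per-dataset averaging), so the function names and the toy masks here are illustrative assumptions, not the authors' code.

```python
from typing import Sequence, Tuple

# A binary mask as nested sequences: 1 = road pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for the road class."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    # Convention chosen for this sketch: two empty masks count as a perfect match.
    return tp / denom if denom else 1.0


def f1_score(pred: Mask, truth: Mask) -> float:
    """F1-score: 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy 3x3 example (hypothetical data): one background pixel is wrongly predicted as road.
pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 1, 1]]
truth = [[1, 1, 0],
         [0, 1, 0],
         [0, 0, 1]]

print(f"IoU = {iou(pred, truth):.4f}")
print(f"F1  = {f1_score(pred, truth):.4f}")
```

In the toy example there are 4 true positives, 1 false positive, and no false negatives, so IoU is 4/5 and F1 is 8/9. For the same set of counts, F1 is never lower than IoU, which is consistent with the ordering of the two figures reported above.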
Title: RoadFormer: Pyramidal deformable vision transformers for road network extraction with remote sensing images