Structure extraction in urbanized aerial images from a single view using a CNN-based approach
International Journal of Remote Sensing ( IF 3.4 ) Pub Date : 2020-06-10 , DOI: 10.1080/01431161.2020.1767821
J. A. de Jesús Osuna-Coutiño 1 , José Martinez-Carranza 1, 2

ABSTRACT High-Level Structure (HLS) extraction in aerial images consists of recognizing Three-Dimensional (3D) elements on human-made surfaces (objects, buildings, ground, etc.). There are several approaches to HLS extraction in aerial images. However, most of them rely on processing two or more images captured from different camera views, or on processing 3D data in the form of point clouds extracted from the camera images. In general, 3D point cloud and multiple-view approaches perform well on certain scenes with video or image sequences, but they need sufficient parallax to guarantee accuracy. An alternative that addresses this problem is to process a single image, interpreting the image areas where human-made structures may be observed. This removes the parallax dependency but adds the challenge of interpreting image ambiguities correctly. Motivated by the latter, this work presents the results of a novel method for HLS extraction from a single image. We focus on extracting building structures from urbanized aerial images. Our method has six steps. First, we use a new Convolutional Neural Network (CNN) architecture to recognize the labels (tree, roof, and floor) in the input image. Second, we use a CNN to predict the depth. Third, we divide the input image using a superpixel technique. Fourth, we segment the superpixels, assigning each one its majority label. Fifth, we recognize the structures using a proposed connection analysis that connects adjacent superpixels with equal labels (tree, roof, and floor). Finally, we apply a geometric analysis to the predicted depth of the recognized labels to extract the 3D shape of the building structure.
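
The fourth and fifth steps (majority label per superpixel, then connecting adjacent superpixels with equal labels) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the superpixel technique or the details of the connection analysis, so the function names `majority_labels` and `connect_superpixels`, the 4-neighbour adjacency, the union-find grouping, and the toy arrays below are all assumptions made for illustration. In practice the label map would come from the CNN and the superpixel map from a superpixel algorithm.

```python
import numpy as np

def majority_labels(superpixels, labels, n_classes):
    """Return the most frequent class label inside each superpixel.

    Hypothetical helper: ties go to the lowest class index (argmax behaviour).
    """
    n_sp = int(superpixels.max()) + 1
    # counts[s, c] = number of pixels of class c inside superpixel s.
    counts = np.zeros((n_sp, n_classes), dtype=np.int64)
    np.add.at(counts, (superpixels.ravel(), labels.ravel()), 1)
    return counts.argmax(axis=1)

def connect_superpixels(superpixels, sp_label):
    """Group 4-adjacent superpixels that share a label (union-find).

    Assumed stand-in for the "connection analysis" named in the abstract.
    Returns one group id per superpixel.
    """
    parent = np.arange(sp_label.size)

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Superpixel ids of horizontally and vertically neighbouring pixels.
    pairs = np.concatenate([
        np.stack([superpixels[:, :-1].ravel(), superpixels[:, 1:].ravel()], axis=1),
        np.stack([superpixels[:-1, :].ravel(), superpixels[1:, :].ravel()], axis=1),
    ])
    for a, b in np.unique(pairs, axis=0):
        if a != b and sp_label[a] == sp_label[b]:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[rb] = ra
    return np.array([find(i) for i in range(sp_label.size)])

# Toy data (assumed): a 4x4 image split into four 2x2 superpixels.
superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1],
                        [2, 2, 3, 3],
                        [2, 2, 3, 3]])
# Per-pixel class predictions: 0 = tree, 1 = roof, 2 = floor.
labels = np.array([[1, 1, 1, 1],
                   [1, 2, 1, 1],
                   [2, 2, 0, 0],
                   [2, 2, 0, 1]])

sp_label = majority_labels(superpixels, labels, n_classes=3)
groups = connect_superpixels(superpixels, sp_label)
print(sp_label.tolist())  # [1, 1, 2, 0]
print(groups.tolist())    # [0, 0, 2, 3]
```

In this toy case the two upper superpixels both receive the roof label and are merged into one group, while the floor and tree superpixels remain separate. A later stage could then combine each group with the predicted depth, as the final step of the abstract describes.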

Updated: 2020-06-10