Current journal: ISPRS Journal of Photogrammetry and Remote Sensing
  • Land cover mapping at very high resolution with rotation equivariant CNNs: Towards small yet accurate models
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2018-02-19
    Diego Marcos, Michele Volpi, Benjamin Kellenberger, Devis Tuia

    In remote sensing images, the absolute orientation of objects is arbitrary. Depending on an object’s orientation and on a sensor’s flight path, objects of the same semantic class can be observed in different orientations in the same image. Equivariance to rotation, understood here as responding with a rotated semantic label map when the input image is rotated, is therefore a very desirable property, in particular for high-capacity models such as Convolutional Neural Networks (CNNs). If rotation equivariance is encoded in the network, the model is confronted with a simpler task and does not need to learn specific (and redundant) weights to address rotated versions of the same object class. In this work we propose a CNN architecture called Rotation Equivariant Vector Field Network (RotEqNet) to encode rotation equivariance in the network itself. By using rotating convolutions as building blocks and passing only the values corresponding to the maximally activating orientation throughout the network, in the form of orientation-encoding vector fields, RotEqNet treats rotated versions of the same object with the same filter bank and therefore achieves state-of-the-art performance even with very small architectures trained from scratch. We test RotEqNet on two challenging sub-decimeter resolution semantic labeling problems and show that it can outperform a standard CNN while requiring an order of magnitude fewer parameters.
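A rough NumPy sketch of the rotating-convolution idea: one filter is applied at several orientations and, per pixel, only the maximal response and the orientation that produced it are kept. This sketch is restricted to 90° rotations of a square filter; the paper uses finer orientation sampling and learned filters.

```python
import numpy as np

def xcorr2(img, f):
    # Valid-mode 2-D cross-correlation (slow reference implementation).
    H, W = img.shape
    h, w = f.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * f)
    return out

def rotating_conv(img, filt, n_angles=4):
    # Apply one (square) filter at several orientations; per pixel, keep the
    # maximal response and the orientation that produced it. This
    # (magnitude, angle) pair is the orientation-encoding vector field
    # passed through the network.
    responses = np.stack([xcorr2(img, np.rot90(filt, k)) for k in range(n_angles)])
    idx = responses.argmax(axis=0)
    magnitude = responses.max(axis=0)
    angle = idx * (2.0 * np.pi / n_angles)
    return magnitude, angle
```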

  • Pan-sharpening via deep metric learning
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2018-02-17
    Yinghui Xing, Min Wang, Shuyuan Yang, Licheng Jiao

    Neighbor-embedding-based pansharpening methods have received increasing interest in recent years. However, image patches do not strictly share similar structures in the shallow MultiSpectral (MS) and PANchromatic (PAN) image spaces, which introduces bias into the pansharpening. In this paper, a new deep metric learning method is proposed to learn a refined geometric multi-manifold neighbor embedding, by exploring the hierarchical features of patches via multiple nonlinear deep neural networks. First, down-sampled PAN images from different satellites are divided into a large number of training image patches, which are then grouped coarsely according to their shallow geometric structures. Afterwards, several Stacked Sparse AutoEncoders (SSAE) with similar structures are separately constructed and trained on these grouped patches. In the fusion stage, image patches of the source PAN image pass through the networks to extract features, which formulate a deep distance metric and thus derive their geometric labels. Then, patches with the same geometric labels are grouped to form geometric manifolds. Finally, the assumption that MS patches and PAN patches form the same geometric manifolds in two distinct spaces is cast on the geometric groups to formulate a geometric multi-manifold embedding for estimating high-resolution MS image patches. Experiments are conducted on datasets acquired by different satellites. The results demonstrate that the proposed method obtains better fusion results than its counterparts in terms of both visual quality and quantitative evaluations.
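The manifold-transfer step at the core of neighbor embedding can be sketched as follows: weights that reconstruct a PAN patch from its nearest dictionary patches are reused on the paired MS patches. Plain Euclidean distance stands in for the paper's SSAE-based deep metric, and the weight normalization assumes a nonzero weight sum.

```python
import numpy as np

def neighbor_embed(pan_patch, pan_dict, ms_dict, k=4):
    # Find the k nearest PAN dictionary patches, solve for embedding weights
    # by least squares, then transfer the weights to the paired MS patches
    # (the same-manifold assumption of neighbor embedding).
    d = np.linalg.norm(pan_dict - pan_patch, axis=1)
    idx = np.argsort(d)[:k]
    w, *_ = np.linalg.lstsq(pan_dict[idx].T, pan_patch, rcond=None)
    w = w / w.sum()                      # normalize (assumes nonzero sum)
    return w @ ms_dict[idx]
```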

  • One-two-one networks for compression artifacts reduction in remote sensing
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2018-02-17
    Baochang Zhang, Jiaxin Gu, Chen Chen, Jungong Han, Xiangbo Su, Xianbin Cao, Jianzhuang Liu

    Compression artifacts reduction (CAR) is a challenging problem in the field of remote sensing. Most recent deep learning based methods have demonstrated superior performance over previous hand-crafted methods. In this paper, we propose an end-to-end one-two-one (OTO) network that combines different deep models, i.e., summation and difference models, to solve the CAR problem. In particular, the difference model, motivated by the Laplacian pyramid, is designed to obtain the high-frequency information, while the summation model aggregates the low-frequency information. We provide an in-depth investigation into our OTO architecture based on the Taylor expansion, which shows that these two kinds of information can be fused in a nonlinear scheme to gain more capacity for handling complicated image compression artifacts, especially the blocking effect in compression. Extensive experiments demonstrate the superior performance of the OTO networks compared to state-of-the-art methods on remote sensing datasets and other benchmark datasets. The source code will be available here: https://github.com/bczhangbczhang/.
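The low/high frequency split that motivates the two branches can be sketched in a Laplacian-pyramid style: a blur keeps the low frequencies and the residual keeps the high frequencies, where blocking artifacts concentrate. A box blur stands in here; the actual OTO branches are learned CNNs.

```python
import numpy as np

def split_frequencies(img, k=3):
    # Box blur keeps the low frequencies (the summation model's input);
    # the residual keeps the high frequencies (the difference model's
    # input). By construction low + high reproduces the image exactly.
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    low = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            low[i, j] = padded[i:i + k, j:j + k].mean()
    return low, img - low
```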

  • Geoinformatics for the conservation and promotion of cultural heritage in support of the UN Sustainable Development Goals
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2018-01-12
    Wen Xiao, Jon Mills, Gabriele Guidi, Pablo Rodríguez-Gonzálvez, Sara Gonizzi Barsanti, Diego González-Aguilera

    Cultural Heritage (CH) is recognised as being of historical, social, and anthropological value and is considered as an enabler of sustainable development. As a result, it is included in the United Nations’ Sustainable Development Goals (SDGs) 11 and 8. SDG 11.4 emphasises the protection and safeguarding of heritage, and SDG 8.9 aims to promote sustainable tourism that creates jobs and promotes local culture and products. This paper briefly reviews the geoinformatics technologies of photogrammetry, remote sensing, and spatial information science and their application to CH. Detailed aspects of CH-related SDGs, comprising protection and safeguarding, as well as the promotion of sustainable tourism are outlined. Contributions of geoinformatics technologies to each of these aspects are then identified and analysed. Case studies in both developing and developed countries, supported by funding directed at the UN SDGs, are presented to illustrate the challenges and opportunities of geoinformatics to enhance CH protection and to promote sustainable tourism. The potential and impact of geoinformatics for the measurement of official SDG indicators, as well as UNESCO’s Culture for Development Indicators, are discussed. Based on analysis of the review and the presented case studies, it is concluded that the contribution of geoinformatics to the achievement of CH SDGs is necessary, significant and evident. Moreover, following the UNESCO initiative to introduce CH into the sustainable development agenda and related ICOMOS action plan, the concept of Sustainable Cultural Heritage is defined, reflecting the significance of CH to the United Nations’ ambition to “transform our world”.

  • A deep learning framework for remote sensing image registration
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2018-01-05
    Shuang Wang, Dou Quan, Xuefeng Liang, Mengdan Ning, Yanhe Guo, Licheng Jiao

    We propose an effective deep neural network for the remote sensing image registration problem. Unlike conventional methods that perform feature extraction and feature matching separately, we pair patches from sensed and reference images, and then directly learn the mapping between these patch pairs and their matching labels for later registration. This end-to-end architecture allows us to optimize the whole processing (learning the mapping function) through information feedback when training the network, which is lacking in conventional methods. In addition, to alleviate the small-data issue of remote sensing images for training, we introduce a self-learning strategy that learns the mapping function using images and their transformed copies. Moreover, we apply transfer learning to reduce the huge computation cost in the training stage. This not only speeds up our framework, but also yields extra performance gains. Comprehensive experiments conducted on seven sets of remote sensing images, acquired by Radarsat, SPOT and Landsat, show that our proposal improves the registration accuracy by up to 2.4–53.7%.
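The self-learning idea can be sketched directly: since the "sensed" image is a transformed copy of the reference, matching labels come for free. The sketch below uses a pure translation as the transform; real transforms would include rotation and scale, and the patch pairs would feed the matching network.

```python
import numpy as np

def make_pairs(img, shift=(2, 3), patch=8, n=8, seed=0):
    # Pair each reference patch with its true counterpart in the shifted
    # copy (label 1) and with a random patch (label 0).
    rng = np.random.default_rng(seed)
    dy, dx = shift
    moved = np.roll(img, shift, axis=(0, 1))
    H, W = img.shape
    pairs, labels = [], []
    for _ in range(n):
        y = int(rng.integers(0, H - patch - dy))
        x = int(rng.integers(0, W - patch - dx))
        ref = img[y:y + patch, x:x + patch]
        pairs.append((ref, moved[y + dy:y + dy + patch, x + dx:x + dx + patch]))
        labels.append(1)
        y2, x2 = int(rng.integers(0, H - patch)), int(rng.integers(0, W - patch))
        pairs.append((ref, moved[y2:y2 + patch, x2:x2 + patch]))
        labels.append(0)
    return pairs, labels
```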

  • Semantic labeling in very high resolution images via a self-cascaded convolutional neural network
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-12-21
    Yongcheng Liu, Bin Fan, Lingfeng Wang, Jun Bai, Shiming Xiang, Chunhong Pan

    Semantic labeling for very high resolution (VHR) images in urban areas is of significant importance in a wide range of remote sensing applications. However, many confusing manmade objects and intricate fine-structured objects make it very difficult to obtain both coherent and accurate labeling results. For this challenging task, we propose a novel deep model with convolutional neural networks (CNNs), i.e., an end-to-end self-cascaded network (ScasNet). Specifically, for confusing manmade objects, ScasNet improves labeling coherence with sequential global-to-local context aggregation. Technically, multi-scale contexts are captured on the output of a CNN encoder and then successively aggregated in a self-cascaded manner. Meanwhile, for fine-structured objects, ScasNet boosts labeling accuracy with a coarse-to-fine refinement strategy, progressively refining the target objects using the low-level features learned by the CNN’s shallow layers. In addition, to correct the latent fitting residual caused by multi-feature fusion inside ScasNet, a dedicated residual correction scheme is proposed, which greatly improves the effectiveness of ScasNet. Extensive experimental results on three public datasets, including two challenging benchmarks, show that ScasNet achieves state-of-the-art performance.
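The global-to-local aggregation can be sketched as follows: context maps at several scales are folded in one after another, coarsest first, ending with the feature map itself. Simple pooling, nearest-neighbour upsampling, and addition stand in for ScasNet's learned fusion.

```python
import numpy as np

def pool(x, f):
    # Average-pool by factor f (assumes dimensions divisible by f).
    H, W = x.shape
    return x.reshape(H // f, f, W // f, f).mean(axis=(1, 3))

def upsample(x, f):
    # Nearest-neighbour upsampling by factor f.
    return np.repeat(np.repeat(x, f, axis=0), f, axis=1)

def self_cascade(feat, scales=(4, 2)):
    # Start from the coarsest context map and successively aggregate finer
    # ones in a self-cascaded manner, ending with the feature map itself.
    agg = upsample(pool(feat, scales[0]), scales[0])
    for s in scales[1:]:
        agg = agg + upsample(pool(feat, s), s)
    return agg + feat
```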

  • Skipping the real world: Classification of PolSAR images without explicit feature extraction
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-12-13
    Ronny Hänsch, Olaf Hellwich

    The typical processing chain for pixel-wise classification from PolSAR images starts with an optional preprocessing step (e.g. speckle reduction), continues with extracting features that project the complex-valued data into the real domain (e.g. by polarimetric decompositions), which are then used as input for a machine-learning based classifier, and ends with an optional postprocessing (e.g. label smoothing). The extracted features are usually hand-crafted as well as preselected and represent a (somewhat arbitrary) projection from the complex to the real domain in order to fit the requirements of standard machine-learning approaches such as Support Vector Machines or Artificial Neural Networks. This paper proposes to adapt the internal node tests of Random Forests to work directly on the complex-valued PolSAR data, which makes any explicit feature extraction obsolete. This approach leads to a classification framework with a significantly decreased computation time and memory footprint, since no image features have to be computed and stored beforehand. The experimental results on one fully-polarimetric and one dual-polarimetric dataset show that, despite the simpler approach, accuracy can be maintained (a decrease of less than 2% for the fully-polarimetric dataset) or even improved (an increase of roughly 9% for the dual-polarimetric dataset).
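One way such an internal node test can operate directly on complex data is by comparing distances to two prototype pixels, never projecting into the real domain. This is a hedged sketch of the idea; the paper's actual node tests may be parameterized differently.

```python
import numpy as np

def complex_node_test(sample, proto_a, proto_b):
    # Route the sample to the left child if it is closer (summed complex
    # modulus of the difference) to prototype A than to prototype B.
    # No explicit real-valued feature is ever computed.
    da = np.abs(sample - proto_a).sum()
    db = np.abs(sample - proto_b).sum()
    return da <= db          # True -> left child, False -> right child
```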

  • A new deep convolutional neural network for fast hyperspectral image classification
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-12-06
    M.E. Paoletti, J.M. Haut, J. Plaza, A. Plaza

    Artificial neural networks (ANNs) have been widely used for the analysis of remotely sensed imagery. In particular, convolutional neural networks (CNNs) are gaining more and more attention in this field. CNNs have proved to be very effective in areas such as image recognition and classification, especially for the classification of large sets composed of two-dimensional images. However, their application to multispectral and hyperspectral images faces some challenges, especially related to the processing of the high-dimensional information contained in multidimensional data cubes, which results in a significant increase in computation time. In this paper, we present a new CNN architecture for the classification of hyperspectral images. The proposed CNN is a 3-D network that uses both spectral and spatial information. It also implements a border mirroring strategy to effectively process border areas in the image, and has been efficiently implemented using graphics processing units (GPUs). Our experimental results indicate that the proposed network performs accurately and efficiently, reducing computation time and increasing accuracy in the classification of hyperspectral images when compared to other traditional ANN techniques.
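The border mirroring strategy can be sketched with a reflect-pad: the spatial dimensions of the data cube are mirrored so that patches centred on border pixels are still full-sized CNN inputs, while the spectral dimension is left untouched. Details (patch size, padding mode) are assumptions here.

```python
import numpy as np

def mirrored_patch(cube, row, col, size=5):
    # Reflect-pad the spatial dims of a (H, W, bands) cube, then extract
    # the size x size spatial patch centred on (row, col).
    half = size // 2
    padded = np.pad(cube, ((half, half), (half, half), (0, 0)), mode='reflect')
    return padded[row:row + size, col:col + size, :]
```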

  • Beyond RGB: Very high resolution urban remote sensing with multimodal deep networks
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-11-23
    Nicolas Audebert, Bertrand Le Saux, Sébastien Lefèvre

    In this work, we investigate various methods for semantic labeling of very high resolution multi-modal remote sensing data. In particular, we study how deep fully convolutional networks can be adapted to deal with multi-modal and multi-scale remote sensing data for semantic labeling. Our contributions are threefold: (a) we present an efficient multi-scale approach to leverage both a large spatial context and the high resolution data, (b) we investigate early and late fusion of Lidar and multispectral data, (c) we validate our methods on two public datasets with state-of-the-art results. Our results indicate that late fusion makes it possible to recover from errors stemming from ambiguous data, while early fusion allows for better joint feature learning, at the cost of higher sensitivity to missing data.
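The early/late fusion distinction can be sketched in two lines: early fusion stacks the modalities before the network, late fusion combines the per-class score maps of two separately trained streams. The weighted average is an illustrative choice.

```python
import numpy as np

def early_fusion(optical, lidar):
    # Early fusion: stack the modalities channel-wise before the network,
    # enabling joint feature learning from the first layer on.
    return np.concatenate([optical, lidar], axis=-1)

def late_fusion(scores_a, scores_b, w=0.5):
    # Late fusion: combine per-class score maps from two separately trained
    # streams; errors of one stream can be recovered by the other.
    return w * scores_a + (1.0 - w) * scores_b
```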

  • Near real-time shadow detection and removal in aerial motion imagery application
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-11-22
    Guilherme F. Silva, Grace B. Carneiro, Ricardo Doth, Leonardo A. Amaral, Dario F.G. de Azevedo

    This work presents a method to automatically detect and remove shadows in urban aerial images, and its application in an aerospace remote monitoring system requiring near real-time processing. Our detection method generates shadow masks and is accelerated by GPU programming. To obtain the shadow masks, we convert images from RGB to the CIELCh color model, calculate a modified Specthem ratio, and apply multilevel thresholding. Morphological operations are used to reduce shadow mask noise. The shadow masks are then used to remove shadows from the original images using the illumination ratio of the shadow/non-shadow regions. We obtained shadow detection accuracy of around 93% and shadow removal results comparable to the state-of-the-art while maintaining execution time under real-time constraints.
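The ratio-plus-threshold step can be sketched as below: shadow pixels combine low lightness L with a characteristic hue h, so a hue-to-lightness ratio separates them. The exact modified Specthem ratio and the multilevel thresholds of the paper differ; this single-threshold form is purely illustrative.

```python
import numpy as np

def shadow_mask(lightness, hue, thresh=1.0):
    # Illustrative ratio: large where lightness is low relative to hue,
    # i.e. in shadow; the +1 offsets avoid division by zero.
    ratio = (hue + 1.0) / (lightness + 1.0)
    return ratio > thresh
```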

  • Object Scene Flow
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-11-22
    Moritz Menze, Christian Heipke, Andreas Geiger

    This work investigates the estimation of dense three-dimensional motion fields, commonly referred to as scene flow. While great progress has been made in recent years, large displacements and adverse imaging conditions as observed in natural outdoor environments are still very challenging for current approaches to reconstruction and motion estimation. In this paper, we propose a unified random field model which reasons jointly about 3D scene flow as well as the location, shape and motion of vehicles in the observed scene. We formulate the problem as the task of decomposing the scene into a small number of rigidly moving objects sharing the same motion parameters. Thus, our formulation effectively introduces long-range spatial dependencies which commonly employed local rigidity priors are lacking. Our inference algorithm then estimates the association of image segments and object hypotheses together with their three-dimensional shape and motion. We demonstrate the potential of the proposed approach by introducing a novel challenging scene flow benchmark which allows for a thorough comparison of the proposed scene flow approach with respect to various baseline models. In contrast to previous benchmarks, our evaluation is the first to provide stereo and optical flow ground truth for dynamic real-world urban scenes at large scale. Our experiments reveal that rigid motion segmentation can be utilized as an effective regularizer for the scene flow problem, improving upon existing two-frame scene flow methods. At the same time, our method yields plausible object segmentations without requiring an explicitly trained recognition model for a specific object class.
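The rigid-object decomposition can be written down directly: every 3-D point on an object shares one rotation R and translation t, so its scene flow is Rp + t - p. This shared parameterization is exactly the long-range prior the model exploits.

```python
import numpy as np

def rigid_flow(points, R, t):
    # Scene flow of a rigidly moving object: points is (N, 3), R a 3x3
    # rotation, t a translation; flow = R p + t - p for every point.
    return points @ R.T + t - points
```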

  • MugNet: Deep learning for hyperspectral image classification using limited samples
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-11-20
    Bin Pan, Zhenwei Shi, Xia Xu

    In recent years, deep learning based methods have attracted broad attention in the field of hyperspectral image classification. However, due to their massive parameters and complex network structures, deep learning methods may not perform well when only a few training samples are available. In this paper, we propose a small-scale data based method, the multi-grained network (MugNet), to explore the application of deep learning approaches in hyperspectral image classification. MugNet can be considered a simplified deep learning model that mainly targets hyperspectral image classification with limited samples. Three novel strategies are proposed to construct MugNet. First, the spectral relationship among different bands, as well as the spatial correlation within neighboring pixels, are both utilized via a multi-grained scanning approach. The proposed multi-grained scanning strategy not only extracts joint spectral-spatial information, but also combines the spectral and spatial relationships of different grains. Second, because abundant unlabeled pixels are available in hyperspectral images, we take full advantage of these samples and adopt a semi-supervised manner in the process of generating convolution kernels. Finally, MugNet is built upon a very simple network that does not include many hyperparameters for tuning. The performance of MugNet is evaluated on one popular and two challenging datasets, and comparison experiments with several state-of-the-art hyperspectral image classification methods are reported.
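The multi-grained scanning idea, applied to the spectral dimension of one pixel, can be sketched as sliding windows of several sizes ("grains") over the bands and collecting every window; the spatial analogue works the same way on neighbouring pixels. Window sizes here are illustrative.

```python
import numpy as np

def multi_grained_scan(spectrum, grains=(3, 5)):
    # Slide windows of each grain size along the spectral bands and
    # collect all windows as the multi-grained representation.
    feats = []
    for g in grains:
        for i in range(len(spectrum) - g + 1):
            feats.append(spectrum[i:i + g])
    return feats
```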

  • A rigorous fastener inspection approach for high-speed railway from structured light sensors
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-11-17
    Qingzhou Mao, Hao Cui, Qingwu Hu, Xiaochun Ren

    Rail fasteners are critical components of high-speed railways and are therefore inspected periodically to ensure the safety of high-speed trains. Manual inspection and two-dimensional visual inspection are the commonly used methods, but both have drawbacks. In this paper, a rigorous high-speed railway fastener inspection approach based on structured light sensors is proposed to detect damaged and loose fasteners. First, precise and extremely dense point clouds of fasteners are obtained from commercial structured light sensors. With a decision tree classifier, the defects of the fasteners are classified in detail. Furthermore, a normal-vector-based center extraction method for complex cylindrical surfaces is proposed to extract the centerline of the metal clip of normal fasteners. Lastly, the looseness of the fastener is evaluated based on the extracted centerline of the metal clip. Experiments were conducted on high-speed railways to evaluate the accuracy, effectiveness, and parameter sensitivity of the proposed method. The overall precision of the decision tree classifier is over 99.8% and the root-mean-square error of the looseness check is 0.15 mm, demonstrating a reliable and effective solution for high-speed railway fastener maintenance.
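A heavily simplified sketch of the final looseness check: once the metal-clip centerline is extracted from the point cloud, looseness can be scored as the vertical offset against a nominal (tight) centerline. The exact evaluation in the paper may differ; the (x, y, z) layout is an assumption.

```python
import numpy as np

def looseness(centerline, reference):
    # Mean vertical offset between the extracted metal-clip centerline and
    # its nominal position; both are (N, 3) arrays of (x, y, z) points.
    return float(np.mean(centerline[:, 2] - reference[:, 2]))
```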

  • A semi-supervised generative framework with deep learning features for high-resolution remote sensing image scene classification
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-11-14
    Wei Han, Ruyi Feng, Lizhe Wang, Yafan Cheng

    High resolution remote sensing (HRRS) image scene classification plays a crucial role in a wide range of applications and has been receiving significant attention. Recently, remarkable efforts have been made to develop a variety of approaches for HRRS scene classification, wherein deep-learning-based methods have achieved considerable performance in comparison with state-of-the-art methods. However, deep-learning-based methods face a severe limitation: a great number of manually-annotated HRRS samples are needed to obtain a reliable model, and there are still not sufficient annotated datasets in the field of remote sensing. In addition, it is a challenge to build a large-scale HRRS image dataset due to the abundant diversities and variations in HRRS images. To address this problem, we propose a semi-supervised generative framework (SSGF), which combines deep learning features, a self-labeling technique, and a discriminative evaluation method to complete the tasks of scene classification and dataset annotation. On this basis, we further develop an extended algorithm (SSGA-E) and evaluate it by extensive experiments. The experimental results show that SSGA-E outperforms most fully-supervised and semi-supervised methods, achieving the third best accuracy on the UCM dataset and the second best accuracy on the WHU-RS, NWPU-RESISC45, and AID datasets. These results demonstrate that the proposed SSGF and its extension are effective in addressing the lack of annotated HRRS datasets, learning valuable information from unlabeled samples to improve classification ability and to obtain a reliable annotated dataset for supervised learning.
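One self-labeling round can be sketched generically: pseudo-label each unlabeled sample with the current classifier, keep only the samples that pass a discriminative confidence check, and extend the training set with the survivors. The `classify`/`confidence` callables and the threshold are placeholders for the framework's learned components.

```python
def self_label(train_x, train_y, unlabeled, classify, confidence, thresh=0.9):
    # Pseudo-label unlabeled samples; only confident ones are kept and
    # appended to the training set for the next round.
    new_x, new_y = list(train_x), list(train_y)
    for x in unlabeled:
        y_hat = classify(x)
        if confidence(x, y_hat) >= thresh:
            new_x.append(x)
            new_y.append(y_hat)
    return new_x, new_y
```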

  • Landmark based localization in urban environment
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-09-28
    Xiaozhi Qu, Bahman Soheilian, Nicolas Paparoditis

    A landmark-based localization approach with uncertainty analysis, based on cameras and geo-referenced landmarks, is presented in this paper. The system is designed to accommodate different camera configurations for six degree-of-freedom pose estimation. Local bundle adjustment is applied for optimization, and the geo-referenced landmarks are integrated to reduce drift. In particular, uncertainty analysis is taken into account: on the one hand, we estimate the uncertainties of poses to predict the precision of localization; on the other hand, uncertainty propagation is considered for matching, tracking and landmark registration. The proposed method is evaluated on both the KITTI benchmark and data acquired by a mobile mapping system. In our experiments, decimeter-level accuracy can be reached.
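The uncertainty propagation underlying such a pipeline follows the standard first-order rule: for a measurement function linearized with Jacobian J, an input covariance becomes J cov Jᵀ. A minimal sketch:

```python
import numpy as np

def propagate_cov(J, cov):
    # First-order propagation of a covariance matrix through a linearized
    # mapping with Jacobian J: cov_out = J cov J^T.
    return J @ cov @ J.T
```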

  • A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-08-02
    Ce Zhang, Xin Pan, Huapeng Li, Andy Gardiner, Isabel Sargent, Jonathon Hare, Peter M. Atkinson

    The contextual-based convolutional neural network (CNN) with deep architecture and pixel-based multilayer perceptron (MLP) with shallow structure are well-recognized neural network algorithms, representing the state-of-the-art deep learning method and the classical non-parametric machine learning approach, respectively. The two algorithms, which have very different behaviours, were integrated in a concise and effective way using a rule-based decision fusion approach for the classification of very fine spatial resolution (VFSR) remotely sensed imagery. The decision fusion rules, designed primarily based on the classification confidence of the CNN, reflect the generally complementary patterns of the individual classifiers. In consequence, the proposed ensemble classifier MLP-CNN harvests the complementary results acquired from the CNN based on deep spatial feature representation and from the MLP based on spectral discrimination. Meanwhile, limitations of the CNN due to the adoption of convolutional filters such as the uncertainty in object boundary partition and loss of useful fine spatial resolution detail were compensated. The effectiveness of the ensemble MLP-CNN classifier was tested in both urban and rural areas using aerial photography together with an additional satellite sensor dataset. The MLP-CNN classifier achieved promising performance, consistently outperforming the pixel-based MLP, spectral and textural-based MLP, and the contextual-based CNN in terms of classification accuracy. This research paves the way to effectively address the complicated problem of VFSR image classification.
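The rule-based decision fusion can be sketched in one line per pixel: trust the contextual CNN where it is confident, otherwise fall back to the spectral MLP (e.g. near object boundaries). The threshold value is illustrative, not the paper's rule set.

```python
def fuse(cnn_label, cnn_confidence, mlp_label, thresh=0.8):
    # Keep the CNN label where its classification confidence is high;
    # otherwise use the pixel-based MLP label.
    return cnn_label if cnn_confidence >= thresh else mlp_label
```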

  • Visual object tracking by correlation filters and online learning
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-07-29
    Xin Zhang, Gui-Song Xia, Qikai Lu, Weiming Shen, Liangpei Zhang

    Due to the complexity of background scenarios and the variation of target appearance, it is difficult to achieve both high accuracy and fast speed in object tracking. Currently, correlation filter based trackers (CFTs) show promising performance in object tracking. CFTs estimate the target’s position by correlation filters with different kinds of features. However, most CFTs can hardly re-detect the target in the case of long-term tracking drift. In this paper, a feature integration object tracker named correlation filters and online learning (CFOL) is proposed. CFOL estimates the target’s position and its corresponding correlation score using the same discriminative correlation filter with multiple features. To reduce tracking drift, a new sampling and updating strategy for online learning is proposed. Experiments conducted on 51 image sequences demonstrate that the proposed algorithm is superior to state-of-the-art approaches.
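The correlation-filter core can be sketched with an FFT: the peak of the circular cross-correlation between consecutive frames gives the target's estimated displacement. Real CFTs learn a discriminative filter over features rather than correlating raw frames, so this is only the skeleton of the idea.

```python
import numpy as np

def track_shift(prev, curr):
    # Circular cross-correlation via the FFT; the argmax of the response
    # map is the estimated (row, col) displacement of the target.
    spectrum = np.fft.fft2(curr) * np.conj(np.fft.fft2(prev))
    response = np.real(np.fft.ifft2(spectrum))
    return np.unravel_index(response.argmax(), response.shape)
```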

  • Learning a constrained conditional random field for enhanced segmentation of fallen trees in ALS point clouds
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-04-09
    Przemyslaw Polewski, Wei Yao, Marco Heurich, Peter Krzystek, Uwe Stilla

    In this study, we present a method for improving the quality of automatic single fallen tree stem segmentation in ALS data by applying a specialized constrained conditional random field (CRF). The entire processing pipeline is composed of two steps. First, short stem segments of equal length are detected and a subset of them is selected for further processing, while in the second step the chosen segments are merged to form entire trees. The first step is accomplished using the specialized CRF defined on the space of segment labelings, capable of finding segment candidates which are easier to merge subsequently. To achieve this, the CRF considers not only the features of every candidate individually, but incorporates pairwise spatial interactions between adjacent segments into the model. In particular, pairwise interactions include a collinearity/angular deviation probability which is learned from training data as well as the ratio of spatial overlap, whereas unary potentials encode a learned probabilistic model of the laser point distribution around each segment. Each of these components enters the CRF energy with its own balance factor. To process previously unseen data, we first calculate the subset of segments for merging on a grid of balance factors by minimizing the CRF energy. Then, we perform the merging and rank the balance configurations according to the quality of their resulting merged trees, obtained from a learned tree appearance model. The final result is derived from the top-ranked configuration. We tested our approach on 5 plots from the Bavarian Forest National Park using reference data acquired in a field inventory. Compared to our previous segment selection method without pairwise interactions, an increase in detection correctness and completeness of up to 7 and 9 percentage points, respectively, was observed.
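The structure of the CRF energy described above can be sketched as unary potentials per segment plus pairwise costs for adjacent segments that are both kept, each entering with its own balance factor. The concrete potentials (learned point-distribution model, collinearity probability, overlap ratio) are abstracted into plain numbers here.

```python
def crf_energy(labels, unary, pairs, pair_costs, w_unary=1.0, w_pair=1.0):
    # Energy of a binary segment labeling: unary[i][l] is the cost of
    # giving segment i label l; pair_costs[k] applies when both endpoints
    # of pairs[k] are kept (label 1). w_unary and w_pair play the role of
    # the balance factors searched over a grid.
    e = w_unary * sum(unary[i][l] for i, l in enumerate(labels))
    e += w_pair * sum(c for (i, j), c in zip(pairs, pair_costs)
                      if labels[i] == 1 and labels[j] == 1)
    return e
```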

  • Construction of pixel-level resolution DEMs from monocular images by shape and albedo from shading constrained with low-resolution DEM
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-03-27
    Bo Wu, Wai Chung Liu, Arne Grumpe, Christian Wöhler

    Lunar Digital Elevation Models (DEMs) are important for successful lunar landing and exploration missions. Lunar DEMs are typically generated by photogrammetry or laser altimetry approaches. Photogrammetric methods require multiple stereo images of the region of interest and may not be applicable where stereo coverage is not available. In contrast, reflectance-based shape reconstruction techniques, such as shape from shading (SfS) and shape and albedo from shading (SAfS), use monocular images to generate DEMs with pixel-level resolution. We present a novel hierarchical SAfS method that refines a lower-resolution DEM to pixel-level resolution given a monocular image with a known light source. We also estimate the corresponding pixel-wise albedo map in the process, and use it to regularize the pixel-level shape reconstruction constrained by the low-resolution DEM. In this study, a Lunar–Lambertian reflectance model is applied to estimate the albedo map. Experiments were carried out using monocular images from the Lunar Reconnaissance Orbiter Narrow Angle Camera (LRO NAC), with spatial resolution of 0.5–1.5 m per pixel, constrained by the Selenological and Engineering Explorer and LRO Elevation Model (SLDEM), with spatial resolution of 60 m. The results indicate that local details are well recovered by the proposed algorithm with plausible albedo estimation. The low-frequency topographic consistency depends on the quality of the low-resolution DEM and the resolution difference between the image and the low-resolution DEM.
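The albedo step can be sketched with a plain Lambertian model (the paper uses the more involved Lunar–Lambertian model): observed intensity is I = albedo · (n · l), so wherever the surface is illuminated the albedo map is I / (n · l).

```python
import numpy as np

def estimate_albedo(intensity, normals, light):
    # Plain-Lambertian albedo: normals is (N, 3) unit normals, light a unit
    # light direction; the clip avoids dividing by (near-)zero incidence.
    cos_i = np.clip(normals @ light, 1e-6, None)
    return intensity / cos_i
```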

  • Disaster damage detection through synergistic use of deep learning and 3D point cloud features derived from very high resolution oblique aerial images, and multiple-kernel-learning
    ISPRS J. Photogramm. Remote Sens. (IF 6.387) Pub Date : 2017-03-09
    Anand Vetrivel, Markus Gerke, Norman Kerle, Francesco Nex, George Vosselman

    Oblique aerial images offer views of both building roofs and façades, and thus have been recognized as a potential source to detect severe building damages caused by destructive disaster events such as earthquakes. Therefore, they represent an important source of information for first responders or other stakeholders involved in the post-disaster response process. Several automated methods based on supervised learning have already been demonstrated for damage detection using oblique airborne images. However, they often do not generalize well when data from new unseen sites need to be processed, hampering their practical use. Reasons for this limitation include image and scene characteristics, though the most prominent one relates to the image features being used for training the classifier. Recently features based on deep learning approaches, such as convolutional neural networks (CNNs), have been shown to be more effective than conventional hand-crafted features, and have become the state-of-the-art in many domains, including remote sensing. Moreover, often oblique images are captured with high block overlap, facilitating the generation of dense 3D point clouds – an ideal source to derive geometric characteristics. We hypothesized that the use of CNN features, either independently or in combination with 3D point cloud features, would yield improved performance in damage detection. To this end we used CNN and 3D features, both independently and in combination, using images from manned and unmanned aerial platforms over several geographic locations that vary significantly in terms of image and scene characteristics. A multiple-kernel-learning framework, an effective way for integrating features from different modalities, was used for combining the two sets of features for classification. 
The results are encouraging: while CNN features produced an average classification accuracy of about 91%, the integration of 3D point cloud features led to an additional improvement of about 3% (i.e. an average classification accuracy of 94%). The significance of 3D point cloud features becomes more evident in the model transferability scenario (i.e., training and testing samples from different sites that vary slightly in the aforementioned characteristics), where the integration of CNN and 3D point cloud features significantly improved the model transferability accuracy up to a maximum of 7% compared with the accuracy achieved by CNN features alone. Overall, an average accuracy of 85% was achieved for the model transferability scenario across all experiments. Our main conclusion is that such an approach qualifies for practical use.
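The feature-integration step via multiple-kernel learning can be sketched as a convex combination of per-modality kernel matrices (one from CNN features, one from 3-D point cloud features) before a single SVM is trained on the fused kernel. Learning the weights is the actual MKL problem; fixed weights are used in this sketch.

```python
import numpy as np

def combine_kernels(kernels, betas):
    # Weighted sum of precomputed kernel matrices, with the weights
    # normalized to sum to one (a convex combination).
    betas = np.asarray(betas, dtype=float)
    betas = betas / betas.sum()
    return sum(b * K for b, K in zip(betas, kernels))
```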
