Crack segmentation through deep convolutional neural networks and heterogeneous image fusion

https://doi.org/10.1016/j.autcon.2021.103605

Abstract

A DCNN-based crack segmentation methodology is proposed that leverages heterogeneous image fusion to alleviate image-related disturbances in intensity or range image data and to mitigate uncertainties through cross-domain (i.e., intensity and range data domains) feature correlation. Intensity and range images are captured from concrete roadways and integrated through data fusion. Three encoder-decoder networks representing different patterns of exploiting the image data (i.e., fused raw image, raw range image, filtered range image, and raw intensity image) are proposed and compared to benchmarks. Experimental results demonstrate that the proposed DCNN exploiting the fused raw image through an “extract-fuse” pattern achieves the most robust and accurate crack segmentation performance among the implemented DCNNs.

Introduction

Recent decades have witnessed a dramatic increase in interest and research efforts towards image-based crack detection [1–3]. Depending on the means of feature extraction, current image-based methods fall mainly into two groups: non-learning-based and learning-based methods. The non-learning-based crack detection methods, to which most of the early applications belong, typically employ handcrafted feature extraction and generation methods including filtering [4,5], thresholding [6,7], morphological operation [8], etc. These handcrafted image processing techniques usually involve parameter selection, which may introduce subjectivity and thus limit the application of non-learning-based methods in real-world scenarios. Learning-based methods can alleviate such subjectivity by learning from data and adapting themselves. Early learning-based methods utilized for crack detection include k-nearest neighbors [9], support vector machine [10], and artificial neural network [11–13]. However, despite their ability to adapt to image data, these learning-based methods only represent a low level of feature abstraction, and thus may also have difficulties in addressing problems with real-world complexities [14,15].

The deep convolutional neural network (DCNN) has rapidly evolved into one of the most advanced and popular learning-based methods for crack detection in the past decade [1,16]. Its advantage over the early learning-based methods is that a deep multi-layer architecture achieves a high level of feature abstraction, which helps tackle complex problems. Some DCNN methods [17–22] are patch-based; that is, images are cropped into smaller patches, and each patch is classified by the DCNN as containing cracks or not. Despite its success, the patch-based approach has the drawback that the resolution of the detected cracks is limited by the patch size. Some researchers [23–26] adopted region-based DCNNs for object detection, which can generate bounding boxes to enclose the cracking features. Nevertheless, the issue of limited resolution still exists in some region-based DCNNs.
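The resolution limit of patch-based detection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic 8×8 image, the patch size, and the mean-intensity rule, which is only a hypothetical stand-in for a trained DCNN patch classifier.

```python
import numpy as np


def patch_labels(image: np.ndarray, patch: int, threshold: float) -> np.ndarray:
    """Return one crack/non-crack label per non-overlapping square patch.

    The mean-intensity rule below is a hypothetical stand-in for a DCNN
    patch classifier; it is not the method of any cited study.
    """
    h, w = image.shape
    rows, cols = h // patch, w // patch
    # Crop to a whole number of patches, then view as (rows, patch, cols, patch).
    blocks = image[: rows * patch, : cols * patch].reshape(rows, patch, cols, patch)
    # Flag a patch when its mean intensity falls below the threshold.
    return blocks.mean(axis=(1, 3)) < threshold


# Synthetic intensity image: bright background with a 1-pixel-wide dark crack.
image = np.full((8, 8), 200.0)
image[:, 3] = 20.0

labels = patch_labels(image, patch=4, threshold=190.0)
# Expand the patch labels back to pixel size to compare against the true crack.
coarse_mask = labels.repeat(4, axis=0).repeat(4, axis=1)

print("true crack pixels:", int((image < 100).sum()))
print("pixels flagged by patches:", int(coarse_mask.sum()))
```

In this toy case the crack is one pixel wide, yet every pixel of each flagged 4×4 patch is marked, so quantities such as crack width cannot be read from the patch-level output.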

To achieve pixel-level resolution in crack detection, most recent applications have explored crack segmentation through DCNNs, which predict a categorical label (e.g., “crack” vs. “non-crack”) for each image pixel. In the literature, many DCNN-based crack segmentation studies [27–36] utilized intensity images because they are readily available. In general, applications using intensity images for crack detection assume that the intensity values of cracked regions are lower than those of non-crack regions [37]. However, intensity images often contain disturbances such as uneven illumination, blemishes, shadows, and oil stains [38,39], which may degrade the performance of DCNN-based methods. More recently, range (i.e., elevation) images were explored by some researchers [14,15,22,40,41] for crack segmentation, relying on the elevation change to indicate the presence of cracks [37]. Although range image data has the advantage over intensity data of being insensitive to uneven illumination, it also suffers from several image disturbances. For example, pavement grooves and surface variations often exist in range images, bringing uncertainties to crack detection [39]. In the literature, image pre-processing techniques are often adopted [14,15,40] to enhance the cracking features and address these disturbances prior to applying the DCNN.

In the aforementioned DCNN-based crack segmentation methods, only intensity data or only range data was utilized for crack detection. To resolve the disturbances existing in either type of image data, image pre-processing techniques such as background correction [42], line filtering [15], surface flattening [14], and median filtering [40] are often adopted prior to DCNN training and testing. In practice, however, the appropriate choice of parameters for the pre-processing procedure is often unclear; furthermore, designing such a procedure often requires a certain level of expertise or prior knowledge, so the effect of pre-processing may be user-dependent. Therefore, uncertainties or subjectivities due to parameter selection may arise. For example, as reported in Fei et al. [40], it is difficult to determine the optimal kernel size of the median filter to address the issue of surface variations in the range image.
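The sensitivity to kernel size can be seen in a small sketch. It is an illustration under stated assumptions, not the pre-processing procedure of the cited study: the synthetic range image, the crack depth, and the helper `median_filter` are all hypothetical, and the filter is a plain NumPy implementation with edge-replicated padding.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(image: np.ndarray, kernel: int) -> np.ndarray:
    """Square median filter with edge-replicated padding (kernel must be odd)."""
    pad = kernel // 2
    padded = np.pad(image, pad, mode="edge")
    # One (kernel x kernel) window per output pixel: shape (H, W, kernel, kernel).
    windows = sliding_window_view(padded, (kernel, kernel))
    return np.median(windows, axis=(-2, -1))


# Synthetic range image: flat surface at elevation 0 with a crack that is
# 2 pixels wide and 5 units deep (values chosen only for illustration).
range_image = np.zeros((10, 10))
range_image[:, 4:6] = -5.0

small = median_filter(range_image, 3)  # window narrower than twice the crack width
large = median_filter(range_image, 5)  # window wide enough to outvote the crack

print("crack kept with kernel 3:", bool((small < 0).any()))
print("crack kept with kernel 5:", bool((large < 0).any()))
```

Here a 3×3 kernel leaves the crack untouched while a 5×5 kernel removes it completely, which is the kind of user-dependent outcome the paragraph above describes.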

The learning-based crack segmentation methodology proposed in this study explores the feasibility of applying image data fusion to alleviate the image-related disturbances through cross-domain feature correlation. Data fusion is a term from informatics referring to a process that considers multiple data sources collectively rather than individually, such that new and comprehensive information is obtained and data uncertainty is mitigated [43]. Data fusion approaches can be categorized as homogeneous or heterogeneous, where the former is applied to a single data type and the latter to multiple types of data [43–46]. A few crack detection studies leveraging homogeneous image fusion [47,48] have been reported in the literature. However, the combined information from homogeneous data sources may still carry the issues inherent in that type of data [46]. For example, when intensity images that suffer from uneven illumination are fused, the fused image data may still reproduce the problems caused by uneven illumination. Thus, it is preferable in practice to leverage heterogeneous data fusion to incorporate data of diverse characteristics and further reduce the uncertainty existing in each type of data [44,46]. In a study on near-surface crack detection, Heideklang and Shokouhi [49] integrated three different data types through heterogeneous data fusion to improve the detection performance. However, their method is not deep learning-based, and hence may have difficulties adapting to data with real-world complexities. In a concrete spall detection study [50], a DCNN-based method with heterogeneous data fusion was developed using two data sources. However, the DCNN architecture takes only the intensity image data as the input, while the other type of data (i.e., depth information from a depth sensor) is not incorporated into the DCNN architecture. Zhou and Song [51] proposed a DCNN-based method with heterogeneous image fusion for crack classification; however, the developed DCNN classifiers have relatively simple layouts intended for patch-level classification only, and thus cannot deliver pixel-level crack segmentation. So far, no heterogeneous image fusion methods have been reported for DCNN-based roadway crack segmentation. Without pixel-level crack detection, important information on cracking features (e.g., crack width and area) cannot be readily obtained, which hinders the severity rating and decision-making process needed for infrastructure maintenance practice [52,53].

This study provides a leap forward towards DCNN-based crack segmentation with heterogeneous image fusion, investigating the feasibility, efficacy, and impact of leveraging heterogeneous image data for crack segmentation. The technical merits of this study are:

  • i) A novel heterogeneous image fusion strategy is proposed for DCNN-based crack segmentation, integrating the information in the raw range image and the raw intensity image at pixel level to address the uncertainties in each individual data source;

  • ii) Two encoder-decoder networks incorporating the proposed heterogeneous image fusion strategy are developed and evaluated;

  • iii) The impact of various data types, including the fused raw image, raw range image, filtered range image, and raw intensity image data, on the proposed DCNNs is evaluated by cross-comparison; meanwhile, suggestions on the use of image pre-processing for DCNN-based crack segmentation studies are provided.

The rest of this paper is organized as follows. The proposed DCNN architectures and the data fusion strategy are described in Section 2 (“Proposed methodology”). Then, Section 3 (“Experimental study”) presents two experimental cases, demonstrating the efficacy of the proposed DCNN-based method with heterogeneous data fusion. Finally, Section 4 (“Conclusion”) summarizes the experimental findings and conclusions.

Proposed methodology

Two topics are discussed in detail in this section, which are: i) heterogeneous data as the DCNN input, and the associated image fusion strategy; and ii) the corresponding DCNN architectures based on encoder-decoder networks for crack segmentation.
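As a rough illustration of pixel-level fusion, the sketch below stacks a range image and an intensity image along a channel axis. It assumes that the two images are spatially co-registered and that fusion is a plain channel-wise (depth) concatenation; the function name `fuse_images`, the 256×256 size, and the random test data are hypothetical, and the paper's actual input pipeline may differ.

```python
import numpy as np


def fuse_images(range_image: np.ndarray, intensity_image: np.ndarray) -> np.ndarray:
    """Stack two co-registered single-channel images into an (H, W, 2) array.

    Assumption: pixel (i, j) in both inputs refers to the same surface point,
    so a channel-wise concatenation pairs the two measurements per pixel.
    """
    if range_image.shape != intensity_image.shape:
        raise ValueError("images must be co-registered with identical shapes")
    return np.stack([range_image, intensity_image], axis=-1)


# Hypothetical stand-ins for a raw range image and a raw intensity image.
rng = np.random.default_rng(0)
range_image = rng.normal(size=(256, 256)).astype(np.float32)
intensity_image = rng.integers(0, 256, size=(256, 256)).astype(np.float32)

fused = fuse_images(range_image, intensity_image)
print(fused.shape)  # channel 0 holds range, channel 1 holds intensity
```

A network that consumes `fused` sees both measurements for every pixel at once, whereas a two-stream design would pass `fused[..., 0]` and `fused[..., 1]` to separate encoders before merging their features.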

Experimental study

In this section, first, the data generation process is introduced, which includes the laser camera, image acquisition and processing, ground truth generation, data augmentation, and dataset configuration; then, information on the computing environment, training configuration, and parameter initialization is introduced in the experimental setup section; finally, two experimental cases are presented.

Conclusion

This study explores the feasibility and strategy of applying DCNN with heterogeneous image fusion for roadway crack segmentation. By leveraging the spatial co-registration feature, the proposed heterogeneous image fusion strategy is beneficial in that it can reduce the uncertainty in individual data source and address the image disturbances by cross-domain feature correlation. This study investigates the following image data sources: fused raw image, raw range image, filtered range image, and

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The authors gratefully acknowledge the following support for the research and publication of this article: Alabama Department of Transportation (Project Number 930-930, Grant Number: 25663) and Alabama Transportation Institute (Grant Number: 14560).

References (68)

  • R. Heideklang et al., Multi-sensor image fusion at signal level for improved near-surface crack detection, NDT & E International (2015)
  • G.H. Beckman et al., Deep learning-based automatic volumetric damage quantification using depth camera, Autom. Constr. (2019)
  • C.V. Dung, Autonomous concrete crack detection using deep fully convolutional neural network, Autom. Constr. (2019)
  • T. Fawcett, An introduction to ROC analysis, Pattern Recogn. Lett. (2006)
  • K. Gopalakrishnan, Deep learning in data-driven pavement image analysis and automated distress detection: a review, Data (2018)
  • A. Mohan et al., Crack detection using image processing: a critical review and analysis, Alexandria Engineering Journal (2017)
  • I. Abdel-Qader et al., Analysis of edge-detection techniques for crack identification in bridges, J. Comput. Civ. Eng. (2003)
  • M. Salman et al., Pavement crack detection using the Gabor filter
  • Y. Fujita et al., A robust automatic crack detection method from noisy concrete surfaces, Mach. Vis. Appl. (2011)
  • S. Wang et al., Pavement Crack Segmentation Algorithm Based on Local Optimal Threshold of Cracks Density Distribution, in (2012)
  • M.R. Jahanshahi et al., Unsupervised approach for autonomous pavement-defect detection and quantification using an inexpensive depth sensor, J. Comput. Civ. Eng. (2012)
  • W. Zhang et al., Automatic crack detection and classification method for subway tunnel safety monitoring, Sensors (2014)
  • G. Moussa et al., A new technique for automatic detection and parameters estimation of pavement crack
  • B.J. Lee et al., Position-invariant neural network for digital pavement crack analysis, Computer-Aided Civil and Infrastructure Engineering (2004)
  • T. Saar et al., Automatic Asphalt Pavement Crack Detection and Classification Using Neural Networks, in: 12th Biennial Baltic Electronics Conference, Tallinn, Estonia (2010)
  • M.R. Jahanshahi et al., An innovative methodology for detection and quantification of cracks through incorporation of depth perception, Machine Vision and Applications (2013)
  • A. Zhang et al., Deep learning-based fully automated pavement crack detection on 3D asphalt surfaces with an improved CrackNet, J. Comput. Civ. Eng. (2018)
  • A. Zhang et al., Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network, Computer-Aided Civil and Infrastructure Engineering (2017)
  • A. Garcia-Garcia et al., A review on deep learning techniques applied to semantic segmentation, arXiv (2017)
  • S. Faghih-Roohi et al., Deep Convolutional Neural Networks for Detection of Rail Surface Defects, in: 2016 IEEE International Joint Conference on Neural Networks (2016)
  • L. Zhang et al., Road Crack Detection Using Deep Convolutional Neural Network, in: 2016 IEEE International Conference on Image Processing (2016)
  • Y.J. Cha et al., Deep learning-based crack damage detection using convolutional neural networks, Computer-Aided Civil and Infrastructure Engineering (2017)
  • S. Park et al., Patch-based crack detection in black box images using convolutional neural networks, J. Comput. Civ. Eng. (2019)
  • Y.J. Cha et al., Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types, Computer-Aided Civil and Infrastructure Engineering (2018)