Abstract

The present work addresses the problem of scarce negative (fault) samples and abundant positive (intact) samples in rail fastener fault diagnosis, together with the low detection accuracy and heavy workload of manual patrol inspection. To replace the tedious and inefficient manual process and to exploit the ability of a Convolutional Neural Network (CNN) to handle unbalanced data, a fault diagnosis method based on a Generative Adversarial Network (GAN) and a Residual Network (ResNet) was developed. First, a GAN was used to learn the distribution of the rail fastener failure data: by learning from the noise distribution, a mapping from noise to image data was established. Additional realistic fault samples were then generated to balance and extend the existing data sets, and these data sets were used as input to ResNet for recognition and detection training. Finally, the average accuracy over multiple experiments was used as the evaluation index. The experimental results revealed that rail fastener fault diagnosis based on GAN and ResNet can improve fault detection accuracy when fault data are severely insufficient.

1. Introduction

With the rapid development of the economy, rail transit has become a primary mode of travel. However, heavy traffic and high transportation pressure pose major challenges to the safety of rail transit [1–3]. Many advanced fault detection methods have been adopted for trains, but the safety of the track itself should not be ignored [4–6]. Because fasteners bear alternating stresses over long periods of service, they are prone to fracture [7, 8]. At present, the inspection of track fasteners relies mainly on manual inspection, in which workers walk along the track and judge the state of each fastener visually. This method is inefficient, expensive, and dependent on worker experience. With the rapid development of image processing, computer vision, machine learning, and neural networks [9, 10], advanced technologies are being considered to reduce time and cost and to improve detection accuracy [11, 12].

Liu et al. [13] proposed a classification and recognition method based on image fusion features and Bayesian compression to detect the state of fasteners by extracting IEOH and MSLBP features. Mao et al. [14] obtained a precise and extremely dense point cloud of fasteners with structured light sensors and proposed a complex cylindrical surface center extraction method based on the vanilla vector to extract the centerline of the metal clip of ordinary fasteners; the looseness of the fasteners was then evaluated from the extracted centerline. Gilbert et al. [15] improved the robustness of fastener detection by adjusting the training data, mining hard examples, and classifying the extracted edge features with a support vector machine (SVM) classifier [16].

Wei et al. [17] proposed a new method for fastener defect detection based on dense SIFT features, in which the Visual Geometry Group 16 (VGG16) network [18] was used for fastener feature extraction and Faster R-CNN was used for defect detection, improving both the detection rate and efficiency. Liu et al. [19] proposed a new method to detect partially worn, missing, and intact fasteners.

In recent years, the problem of data imbalance has been prevalent in face recognition, satellite image detection, medical diagnostic decision-making, and other fields [20]. Class imbalance can impair the predictive ability of the classification algorithm because the algorithm pursues the overall classification accuracy. Therefore, increasing attention is paid to the problem of data imbalance.

In the traditional cost-sensitive approach, the target cost function is modified by weighting minority-class and majority-class samples differently: minority-class samples are assigned higher weights, so their misclassification is penalized more severely. Sampling methods include undersampling, which reduces the size of the majority classes by deleting samples, and oversampling, which increases the number of minority-class samples. Random undersampling and random oversampling are the most common implementations of these two strategies.
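As an illustration of these two traditional strategies, the minimal sketch below shows inverse-frequency class weighting and random oversampling in Python; the array shapes, class labels, and toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def class_weights(labels):
    """Inverse-frequency weights: minority classes receive larger weights."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = counts.sum() / (len(classes) * counts)
    return dict(zip(classes, weights))

def random_oversample(features, labels, seed=0):
    """Duplicate minority-class samples until every class matches the majority size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    idx = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(labels == c)
        extra = rng.choice(members, size=target - n, replace=True)
        idx.append(np.concatenate([members, extra]))
    idx = np.concatenate(idx)
    return features[idx], labels[idx]

# Hypothetical unbalanced toy data: 200 intact (0) vs. 40 faulty (1) samples.
X = np.random.rand(240, 28 * 28)
y = np.array([0] * 200 + [1] * 40)
print(class_weights(y))       # e.g. {0: 0.6, 1: 3.0}
X_bal, y_bal = random_oversample(X, y)
print(np.bincount(y_bal))     # both classes now have 200 samples
```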

Existing intelligent fault diagnosis methods assume balanced training data, that is, the same number of labeled samples for each experimental condition. Suh et al. [21] reduced the degree of data imbalance through oversampling and thereby improved accuracy. Hao and Liu [22] applied the K-means synthetic minority oversampling technique (SMOTE) to oversample the minority-class samples.

The Generative Adversarial Network (GAN) performs well in fields such as human–computer interaction, computer vision, and natural language processing [23, 24]. Bissoto et al. [25] employed the pix2pixHD GAN for image synthesis, generating images from semantic and instance maps rather than from random noise. For the intelligent detection of track fasteners, however, it is impossible to obtain a large amount of fault data in a short time to balance the data sets for network training. Learning from unbalanced data is a long-standing problem: the distribution of training data across object classes can become severely skewed [26]. With unbalanced training data, most existing learning algorithms develop an inductive bias toward the majority classes, leading to poor recognition performance and low accuracy on the minority classes [27]. Figure 1 illustrates how unbalanced data cause classification errors and how artificially generated data can compensate for the imbalance between classes. Convolutional neural network (CNN) [28] technology has achieved great success in image recognition and detection; however, these algorithms were not designed for rail transit. The present work proposes a fault diagnosis method for track fasteners based on deep learning. The method consists of two parts: (1) a GAN is used to learn the distribution of real data samples, and additional fault samples are generated to balance and extend the training data sets; (2) a residual network (ResNet) is used for track fastener fault diagnosis and classification, trained and validated in groups on the extended data set.

The main contributions of the present paper are as follows:
(1) A fault diagnosis method combining GAN and ResNet models is proposed for rail fasteners with unbalanced data.
(2) The good performance of GAN in handling unbalanced data is applied to rail fasteners.
(3) The excellent performance of ResNet in image classification is combined with the data generation capability of GAN to expand the amount of data.
(4) The method provides an approach for intelligent detection when failure data cannot be obtained experimentally or within a short time.

The rest of the paper is organized as follows. An introduction to theoretical fundamentals is provided in Section 2; Section 3 presents the experimental data; Section 4 completes the experimental verification of data and methods; and Section 5 concludes the paper.

2. Theoretical Background

Deep learning models achieve relatively advanced performance on many tasks with high-dimensional balanced data sets, and they can optimize millions of parameters through training on labeled data [29]. Because of the large number of parameters, such models easily overfit small data sets, and their generalization performance becomes dependent on the class proportions of the labeled data. Some traditional approaches, such as modified target cost functions, sampling methods, and manual data generation, are available to deal with class imbalance.

2.1. Generative Adversarial Network

Most existing sampling methods do not consider the underlying distribution of the real data and can produce unrealistic samples. Artificial neural networks (ANNs) have good application prospects in data generation [30, 31]. In 2014, Goodfellow et al. from the University of Montreal first proposed the GAN. GANs have since been used to solve the problem of data imbalance and to generate mechanical data samples [32–34]. Khan et al. [35] used a GAN for damage detection and life prediction of bearings. Wang et al. [36] applied a GAN to fault detection of a planetary gearbox with unbalanced data.

A GAN generally consists of two modules, the generator network (G) and the discriminator network (D), both of which are parameterized deep neural networks. Figure 2 presents the structure of the GAN algorithm. The input of G is the noise Z, generally drawn from a Gaussian random distribution, and its output G(Z) is the generated data, which for image data is usually a picture. The real data are denoted by X. The discriminator D receives either X or G(Z) and judges the probability that its input comes from the real data rather than from the noise-driven generator.

The goal of GAN is to learn, from random noise, a generator whose output distribution matches the real data distribution. The discriminator network is updated through adversarial training to accurately distinguish real from generated data, while the generator network is optimized to produce realistic samples that the discriminator cannot recognize as fake. The principle of GAN can be summarized as follows:

$$\min_{G}\max_{D} V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_{z}(z)}[\log(1 - D(G(z)))], \tag{1}$$

where $p_{\mathrm{data}}$ is the true data distribution and $p_{z}$ is the noise distribution. Through the adversarial training between the generator network $G$ and the discriminator network $D$, the noise distribution can be mapped to a distribution close to $p_{\mathrm{data}}$, thus generating realistic negative-sample data. A specific sample $x$ may come from the true distribution or from the generated distribution, so its contribution to the discriminator's loss function is

$$\ell(x) = -p_{\mathrm{data}}(x)\log D(x) - p_{g}(x)\log\bigl(1 - D(x)\bigr), \tag{2}$$

where $p_{\mathrm{data}}$ is the true distribution and $p_{g}$ is the generated distribution. Setting the derivative with respect to $D(x)$ to zero, the globally optimal discriminator is obtained as

$$D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}. \tag{3}$$
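For readers who prefer code, the following minimal sketch expresses the value function of equation (1) as the usual pair of binary cross-entropy losses in TensorFlow; it is an illustrative formulation, not the authors' implementation, and the tensor names are hypothetical.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=False)

def discriminator_loss(d_real, d_fake):
    # Maximizing V(D, G) in equation (1) is equivalent to minimizing
    # -log D(x) on real samples and -log(1 - D(G(z))) on generated samples.
    real_loss = bce(tf.ones_like(d_real), d_real)
    fake_loss = bce(tf.zeros_like(d_fake), d_fake)
    return real_loss + fake_loss

def generator_loss(d_fake):
    # Non-saturating form: the generator tries to make the discriminator
    # output 1 on generated samples.
    return bce(tf.ones_like(d_fake), d_fake)
```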

The optimization function of the generator can be obtained by substituting the optimal discriminator $D^{*}(x)$ into equation (1):

$$C(G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}\right] + \mathbb{E}_{x \sim p_{g}}\!\left[\log \frac{p_{g}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}\right]. \tag{4}$$

The expressions of the KL divergence and the JS divergence are as follows:

$$\mathrm{KL}(P \parallel Q) = \mathbb{E}_{x \sim P}\!\left[\log \frac{P(x)}{Q(x)}\right], \tag{5}$$

$$\mathrm{JS}(P \parallel Q) = \frac{1}{2}\,\mathrm{KL}\!\left(P \,\Big\|\, \frac{P + Q}{2}\right) + \frac{1}{2}\,\mathrm{KL}\!\left(Q \,\Big\|\, \frac{P + Q}{2}\right). \tag{6}$$

Clearly, equation (4) has a form similar to the JS divergence, so it can be converted into

$$C(G) = -\log 4 + 2\,\mathrm{JS}\!\left(p_{\mathrm{data}} \parallel p_{g}\right). \tag{7}$$

In conclusion, when the discriminator is optimal, the loss of the generator is approximately equivalent to the JS divergence between the real sample distribution and the distribution of the samples produced by the generator.
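The conversion from equation (4) to equation (7) follows by adding and subtracting $\log 2$ inside each expectation; the short derivation below is included only to make that step explicit.

```latex
\begin{aligned}
C(G) &= \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}\right]
      + \mathbb{E}_{x \sim p_{g}}\!\left[\log \frac{p_{g}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}\right] \\
     &= -\log 4
      + \mathrm{KL}\!\left(p_{\mathrm{data}} \,\Big\|\, \frac{p_{\mathrm{data}} + p_{g}}{2}\right)
      + \mathrm{KL}\!\left(p_{g} \,\Big\|\, \frac{p_{\mathrm{data}} + p_{g}}{2}\right) \\
     &= -\log 4 + 2\,\mathrm{JS}\!\left(p_{\mathrm{data}} \parallel p_{g}\right).
\end{aligned}
```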

2.2. Residual Learning Module

The backbone networks of deep learning have evolved from AlexNet and VGG to GoogLeNet. A deeper network structure can extract more complex features; however, simply increasing depth causes vanishing or exploding gradients and a degradation in accuracy. The residual network model addresses these problems of gradient disappearance or explosion and accuracy degradation, and it is easier to optimize.

The ResNet architecture takes the VGG19 network as a reference and modifies it by introducing shortcut connections that form residual units. In ResNet, downsampling is performed directly by convolutions with a stride of 2, and global average pooling replaces the fully connected layer. An important design principle of ResNet is to keep the per-layer complexity constant: when the size of the feature map is halved, the number of feature maps is doubled. Compared with a plain network, ResNet adds a shortcut connection between every two layers, which enables residual learning. Table 1 lists ResNet architectures of different depths. In this article, the ResNet50 network is selected for rail fastener detection.
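As a usage sketch, a ResNet50 classifier for the three fastener states could be instantiated from the stock Keras implementation as shown below; the input resolution and the choice to train from scratch are assumptions rather than details reported in the paper.

```python
import tensorflow as tf

# Minimal sketch: a ResNet50 backbone with a three-class head
# (intact, fractured, missing). The 224x224 input size is an assumption.
model = tf.keras.applications.ResNet50(
    weights=None,               # no pretraining; train from scratch on fastener images
    input_shape=(224, 224, 3),
    classes=3,
)
model.summary()
```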

ResNet introduces residual learning with $x$ as the input and $y$ as the output. A plain convolutional neural network learns the target mapping directly through training, whereas residual learning uses several parameterized network layers to learn the residual between the input and the output, so the final output of the residual unit is $y = F(x) + x$ [37, 38].

The residual learning module, shown in Figure 3, is defined as

$$y = F(x, \{W_{i}\}) + x, \tag{8}$$

where $x$ and $y$ are the input and output vectors of the network layers and the function $F(x, \{W_{i}\})$ represents the residual mapping to be learned.
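The sketch below shows one way to implement such a residual unit in Keras, corresponding to equation (8); the filter count, kernel size, and layer arrangement are illustrative assumptions rather than the exact blocks used in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=64, kernel_size=3):
    """Basic residual unit: y = F(x, {W_i}) + x, followed by ReLU."""
    shortcut = x                                   # identity branch
    y = layers.Conv2D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([y, shortcut])                # element-wise F(x) + x
    return layers.ReLU()(y)

# Usage: stack blocks on a feature map whose channel count matches `filters`.
inputs = tf.keras.Input(shape=(56, 56, 64))
outputs = residual_block(residual_block(inputs))
block_model = tf.keras.Model(inputs, outputs)
```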

As shown in Figure 3, the function $F$ can represent multiple convolutional layers. To simplify the notation, the ReLU activation function and the bias terms are omitted. The $F(x) + x$ operation is realized by a shortcut connection and element-wise addition, channel by channel. Assuming that the residual unit at layer $l$ takes $x_{l}$ as input and produces the activation $x_{l+1}$, this can be written simply as

$$x_{l+1} = x_{l} + F(x_{l}, W_{l}). \tag{9}$$

By continuously stacking residual units, the feature expression of any deeper unit $L$ can be obtained as

$$x_{L} = x_{l} + \sum_{i=l}^{L-1} F(x_{i}, W_{i}). \tag{10}$$

Assuming that $\varepsilon$ is the loss function, the following equation can be obtained from the chain rule of backpropagation:

$$\frac{\partial \varepsilon}{\partial x_{l}} = \frac{\partial \varepsilon}{\partial x_{L}} \cdot \frac{\partial x_{L}}{\partial x_{l}} = \frac{\partial \varepsilon}{\partial x_{L}} \left(1 + \frac{\partial}{\partial x_{l}} \sum_{i=l}^{L-1} F(x_{i}, W_{i})\right). \tag{11}$$

This derivative consists of two parts: one that does not pass through the weight layers and one that does. The former guarantees that the signal can be propagated directly back to the shallow layers and avoids gradient disappearance, because $\frac{\partial}{\partial x_{l}} \sum_{i=l}^{L-1} F(x_{i}, W_{i})$ cannot always be $-1$.

3. Data Preparation

3.1. Experimental Flow

By integrating the rail fastener fault diagnosis module into the adversarial training scheme, the model was optimized with both real and generated data, and a rail fastener fault diagnosis method based on GAN and ResNet was proposed. The method consists of two parts. In the first part, the GAN was used to learn the distribution of real data samples, and the training data sets were balanced and expanded by generating additional realistic negative samples. In the second part, the ResNet network was trained on the extended data sets for identification and diagnosis. The diagnostic flowchart for rail fasteners is presented in Figure 4. Fault-class samples were generated, and the training data sets were explicitly expanded to further improve the performance of the fault diagnosis model.

3.2. Data Sources

The data for the current experiment consisted of images of rail fasteners from a subway operation line, containing a large number of intact fasteners and a small number of faulty fasteners (the main faults being loss and fracture). Figure 5 presents samples of the real data. These fastener images have some distinctive characteristics: the texture of the fastener differs from that of the surrounding parts, and the length and width of the fastener do not exceed those of the base plate.

Figure 6 presents the data acquisition equipment. The equipment is adapted to standard-gauge track; a charge-coupled device (CCD) line-scan camera mounted on it collects the images. The field of view of the collected images includes the rail and the fasteners on both sides. The images were preprocessed, the fastener region was extracted, and a feature description of the fastener was established.

3.3. Data Generation

A GAN was used to learn the distribution of the track fastener image data and to generate realistic simulated samples to expand the data set. The fault diagnosis of track fasteners in different states was studied using multiple networks in a distributed learning setup: several learning modules were set up to generate different types of fasteners at the same time, each module learning the data distribution of one fastener type individually. Figure 7 shows the data generation structure and the multiple learning modules.

The input noise Z follows a standard Gaussian distribution. The generation network used two fully connected input layers and three convolution layers with filter sizes of 128, 64, and 32; the identification network used two convolution layers with filter sizes of 64 and 32. The ReLU activation function was used throughout the network, and the Adam optimizer was used to train the model. Tables 2 and 3 present the algorithms of the generation and identification networks, respectively.

The generator network consists of 100 input neurons, a hidden layer containing 128 neurons, and an output layer of 784 = 28 × 28 neurons. The discriminator network consists of 784 = 28 × 28 input neurons, a hidden layer containing 128 neurons, and 1 output neuron. The output neuron produces a value between 0 and 1, representing the probability that the input is real. Figures 8 and 9 show the structures of the generator and discriminator networks, respectively, and Figure 10 shows the dynamic architecture of the GAN.
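A minimal Keras sketch of networks with these layer sizes is given below; the hidden and output activation functions, and any other details not stated above, are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_generator():
    # 100 noise inputs -> 128 hidden units -> 784 = 28 x 28 outputs.
    return tf.keras.Sequential([
        layers.Input(shape=(100,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(784, activation="tanh"),   # assumed output activation
        layers.Reshape((28, 28, 1)),
    ])

def build_discriminator():
    # 784 = 28 x 28 inputs -> 128 hidden units -> 1 output neuron.
    return tf.keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # probability that the input is real
    ])

generator = build_generator()
discriminator = build_discriminator()
```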

During the continuous adversarial training of the two networks, the losses of the generator and discriminator networks decreased and gradually approached their minimum values as the two networks reached the Nash equilibrium. Figure 11 presents the observed G-loss curve of the generator network and the D-loss curve of the discriminator network. A partial magnification of the loss curves reveals that the two networks gradually reached the Nash equilibrium after about 3000 steps.
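To make the adversarial alternation explicit, the following sketch trains the generator and discriminator from the previous sketch with the losses defined in the Section 2.1 sketch; the batch size and optimizer settings here are illustrative assumptions.

```python
import tensorflow as tf

g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(real_images, batch_size=64):
    noise = tf.random.normal([batch_size, 100])        # standard Gaussian input Z
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        d_real = discriminator(real_images, training=True)
        d_fake = discriminator(fake_images, training=True)
        d_loss = discriminator_loss(d_real, d_fake)    # defined in the Section 2.1 sketch
        g_loss = generator_loss(d_fake)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return g_loss, d_loss
```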

To improve the model recognition accuracy, the GAN was used with different noise inputs to generate images of fractured and missing track fasteners over different training cycles. Figure 12 displays the sample data generated in different training cycles.

4. Validation of the Method

4.1. Building Data Sets

A deep learning model for rail fastener fault diagnosis was established based on the ResNet model. The faulty fastener data generated by the GAN network were combined with the intact fastener data to form a large data set. Because the output of the CNN model follows a probability distribution, the input labels must also follow a probability distribution so that the cross-entropy between the model output and the label can be calculated and the network can be optimized. Therefore, all samples in the data set were labeled with one-hot codes, which convert each class label into a vector in which only one element is nonzero. Table 4 presents the one-hot coded labels.
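One-hot encoding of the three fastener states could be done as in the sketch below; the specific label-to-index mapping is an assumption, since the paper's Table 4 defines the actual codes.

```python
import tensorflow as tf

# Hypothetical integer class indices: 0 = intact, 1 = fractured, 2 = missing.
labels = tf.constant([0, 2, 1, 0])
one_hot_labels = tf.one_hot(labels, depth=3)
# one_hot_labels:
# [[1., 0., 0.],
#  [0., 0., 1.],
#  [0., 1., 0.],
#  [1., 0., 0.]]
```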

This experimental algorithm was designed and developed based on the open-source deep learning framework of TensorFlow, and all experimental data sets were stored through the unified storage format “TFRecord” provided by TensorFlow.
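A minimal sketch of storing image–label pairs in the TFRecord format is shown below; the file name, image paths, and feature keys are hypothetical, not the ones used by the authors.

```python
import tensorflow as tf

def serialize_example(image_bytes, label):
    """Pack one encoded image and its integer label into a tf.train.Example."""
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()

# Writing (paths and labels are placeholders).
with tf.io.TFRecordWriter("fasteners_train.tfrecord") as writer:
    for path, label in [("img_0001.png", 0), ("img_0002.png", 2)]:
        writer.write(serialize_example(tf.io.read_file(path).numpy(), label))

# Reading back into a tf.data pipeline.
feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}
def parse(record):
    example = tf.io.parse_single_example(record, feature_spec)
    image = tf.io.decode_png(example["image"], channels=3)
    return image, example["label"]

dataset = tf.data.TFRecordDataset("fasteners_train.tfrecord").map(parse)
```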

4.2. Experimental Results and Analysis

The current experimental algorithm was developed in the TensorFlow framework and used an 18-layer ResNet as the fault diagnosis model. The hardware consisted of two Intel Xeon E5-2678 v3 CPUs (2.5 GHz), the 64-bit Windows 10 operating system, and two NVIDIA TITAN RTX graphics cards. From each class of the data set, 4000 samples were randomly selected to form a training set of 12,000 samples, and 1000 samples were randomly selected from the remaining data to form a test set of 3000 samples. The batch size was set to 100, the number of training iterations to 2000, and the learning rate to 0.001.
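The training configuration above could be expressed in Keras roughly as follows; here `model` stands for a ResNet classifier such as the one sketched in Section 2.2, `train_ds` and `test_ds` are assumed tf.data pipelines of (image, one-hot label) pairs, and the use of the Adam optimizer is an assumption rather than a detail reported in the paper.

```python
import tensorflow as tf

BATCH_SIZE = 100       # batch size reported in the paper
ITERATIONS = 2000      # training iterations reported in the paper
LEARNING_RATE = 0.001  # learning rate reported in the paper

train_batches = train_ds.shuffle(12000).batch(BATCH_SIZE).repeat()

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    loss="categorical_crossentropy",   # labels are one-hot encoded (see Section 4.1)
    metrics=["accuracy"],
)
# 2000 iterations at batch size 100: steps_per_epoch * epochs = ITERATIONS.
model.fit(train_batches, steps_per_epoch=200, epochs=ITERATIONS // 200)
model.evaluate(test_ds.batch(BATCH_SIZE))
```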

In this case, fault identification experiments were conducted on the rail fasteners. Three health conditions were considered: intact, fractured, and missing. To improve the reliability of the results, the experiment was divided into five groups. In each group, the training set contained a total of 2400 samples, 800 per health condition, while the validation set and the test set each contained 600 samples, 200 per condition. Finally, the average over the groups was taken as the final accuracy of the GAN + ResNet method. Table 5 and Figure 13 present the accuracies on the training and test sets for all experiments.

Table 6 presents the composition of the data groups generated at different training epochs of the GAN network and used for experimental verification, and Table 7 lists the accuracy of the different methods. Figure 14 presents the accuracies of the different methods. LBP denotes the local binary pattern, which can effectively describe textures; HOG denotes the histogram of oriented gradients, a common local texture descriptor in computer vision and pattern recognition; and VGG16 is a classical deep learning model with good classification ability.

The same data set was used for LBP + SVM, HOG + SVM, VGG16, and GAN + ResNet. The t-SNE algorithm was used to reduce the dimensionality of the model predictions on the test set and to visualize the high-dimensional outputs in a two-dimensional space. It was found that the GAN + ResNet method classified the data well, and its classification effect was clearly better than that of the other methods. Figure 15 displays the test data distributions for the different models.
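A minimal sketch of such a t-SNE projection using scikit-learn is given below; the feature source (the classifier's softmax outputs), the variables `model`, `test_images`, and `true_labels`, and the plotting details are assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# `predictions` is assumed to be the model's output on the test set,
# shape (n_samples, 3); `true_labels` holds the integer class indices.
predictions = np.asarray(model.predict(test_images))
embedded = TSNE(n_components=2, random_state=0).fit_transform(predictions)

plt.scatter(embedded[:, 0], embedded[:, 1], c=true_labels, cmap="viridis", s=8)
plt.title("t-SNE projection of test-set predictions")
plt.show()
```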

To further analyze the diagnostic effect of each method on different faults, the classification results of each method were visualized with confusion matrices. In Figure 16, the green squares indicate the numbers of correctly diagnosed samples, and the blue squares indicate the numbers of misdiagnosed samples.

The recall rate and classification accuracy of each method were calculated from the confusion matrix. Tables 8 and 9 present the recall rates and classification accuracies of the different methods, respectively. It is evident that the proposed method could accurately classify each type of fastener, with an overall accuracy of over 98%.
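The per-class metrics can be computed from a confusion matrix as in the short sketch below; the example matrix values are made up for illustration and are not the results from Tables 8 and 9.

```python
import numpy as np

# Rows = true class, columns = predicted class (intact, fractured, missing).
cm = np.array([[198,   1,   1],
               [  2, 196,   2],
               [  0,   3, 197]])

recall = np.diag(cm) / cm.sum(axis=1)      # per-class recall
precision = np.diag(cm) / cm.sum(axis=0)   # per-class precision
accuracy = np.trace(cm) / cm.sum()         # overall accuracy

print("recall:", np.round(recall, 3))
print("precision:", np.round(precision, 3))
print("accuracy:", round(float(accuracy), 3))
```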

The above experiments show that the ResNet-based method achieves the best effect. Conventional classification algorithms are not designed for unbalanced data sets, so unbalanced data sets are difficult to classify well. The main processing methods for unbalanced data sets are sampling based and can be divided into undersampling and oversampling.

To verify the performance of the GAN, fault identification experiments were carried out on the track fasteners with the fracture and loss faults grouped together, so that the fasteners were divided into two classes: healthy and faulty. There were 200 healthy samples and 40 faulty samples, a ratio of 5:1. Oversampling and undersampling were each applied and verified on ResNet. Then, 80 faulty samples were generated by the GAN so that the proportion of healthy to faulty samples was 1:1, and the samples were input into the ResNet network for training. Finally, the results were visualized with confusion matrices.

Figure 17 shows the confusion matrices for the different data processing methods, and Table 10 shows the classification result indices. The comparison reveals that undersampling reduces the number of majority-class samples, which inevitably causes information loss and therefore gives the worst result. Oversampling duplicates minority-class samples to increase their number; however, because no new information is incorporated, it can lead to overfitting. In contrast, adding the synthetic images from the GAN model to the training set helped train the ResNet to a better classification performance.

5. Conclusions

The present paper proposed a rail fastener fault diagnosis method based on deep learning, in which GAN and ResNet were used to solve the problem of data imbalance. The main observations are as follows:
(1) The mapping between the noise distribution and the real track fault data was established through the adversarial training of the generator and discriminator networks and was used to generate additional negative samples to balance and further expand the training data set.
(2) In the experimental verification on faulty and intact fastener samples, the t-SNE visualization showed that the detection effect of the GAN + ResNet method was clearly better than that of the other methods.

The proposed rail fastener fault diagnosis method based on GAN and ResNet can improve the accuracy of fault detection when fault data are insufficient and provides a new idea for fault diagnosis with unbalanced rail fastener data categories.

Data Availability

The data used to support the findings of the study cannot be made publicly available because the permit to make them available is not provided by the funders.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The present work was funded by the National Natural Science Foundation of China (Grant nos. 51605023 and 51975038), the Natural Science Foundation of Beijing, China (Grant no. L191005), the Support Plan for the Development of High-Level Teachers in Beijing Municipal Universities (Grant nos. CIT&TCD201904062 and CIT&TCD201704052), the General Project of Scientific Research Program of Beijing Education Commission (Grant no. SQKM201810016015), the Scientific Research Fund of Beijing University of Civil Engineering Architecture (Grant no. ZF15068), the Graduate Innovation Program of Beijing University of Civil Engineering Architecture (Grant no. PG2020092), and the Fundamental Research Funds for Beijing University of Civil Engineering and Architecture (Grant no. X18133).