Abstract

Large-scale mechanical equipment monitoring involves various kinds and quantities of information, and the present research on multisensor information fusion may face problems of information conflicts and modeling complexity. This paper proposes an analysis method combining correlation analysis and deep learning. According to the characteristics of monitoring data, three types of correlation coefficients between sensors in different states are obtained, and a new composite correlation analytical matrix is established to fuse the multisource heterogeneous data. The matrix represents fault feature information of different equipment states and helps further image generation. Meanwhile, a convolutional neural network-based deep learning method is developed to process the matrix and to discover the relationship between results and equipment states for fault diagnosis. To verify the method of this paper, experimental and field case studies are performed. The results show that it can accurately identify fault states and has higher diagnostic efficiency and accuracy than traditional methods.

1. Introduction

With the development of technology, mechanical equipment has become more complex, automated, and high-speed, which greatly increases the difficulty of equipment condition monitoring and fault diagnosis. Multiple sensors can receive more fault information and improve the accuracy of fault diagnosis [1, 2], and thus large numbers of sensors of different types are applied for data collection in the monitoring of large-scale mechanical equipment [3–5], which has produced various kinds and quantities of monitoring data. How to make effective use of these multisensor data to improve the accuracy of equipment fault diagnosis has become a hot issue.

Because the data come from many sources and are of different types, data fusion is needed in order to perform the fault diagnosis. Generally, there are two kinds of fault diagnosis based on multiple sensors: decision-level fusion diagnosis and feature-level fusion diagnosis. For decision-level fusion diagnosis, methods such as DS evidence theory [6–9] and fuzzy decision theory [10, 11] are mostly used. These methods first use a single sensor to identify the information of the equipment state and generate the evidence and then make the final decision according to certain rules. However, because equipment states are nonlinear and nonstationary, the evidence from different sensors often conflicts. Therefore, methods such as information entropy [6, 8], weighted combination [7, 9], and fuzzy reasoning [10] are needed to resolve these conflicts, and they lack universality. Feature-level fusion extracts the features of each single sensor signal by using time-frequency domain analysis methods such as the fast Fourier transform [2] and wavelet transform [3], then combines the features of all sensors into a new feature vector according to certain methods, and finally uses machine learning methods such as support vector machine (SVM) [12], artificial neural networks (ANNs) [13], and K-nearest neighbor (KNN) [14] to diagnose the fault. Nevertheless, as the number of sensors increases, the dimension of the feature vector becomes too high; to cope with this, some scholars have adopted statistical analysis [13], principal component analysis (PCA) [15], the relevance vector machine (RVM) [16], and other methods for data dimensionality reduction. Deep learning approaches such as the convolutional neural network (CNN) [17, 18] and long short-term memory (LSTM) [19, 20] are also applied for information fusion and fault diagnosis.
Relative to the diagnosis method based on the decision-level fusion, feature-level fusion diagnosis methods have better adaptability and diagnostic accuracy. However, this kind of method is mainly adaptable for the same type of sensors; when sensors of different kinds are analyzed, effective information is often lost due to differences in the characteristics and distribution of data.

Most previous studies treated the information of multiple sensors as a single signal and ignored the coupling relationship between signals, resulting in the loss of effective information [14]. However, for certain mechanical equipment, subsystems interact with and influence one another, so there must be some relationship between the different sources of monitoring data. When the equipment state changes, the corresponding relations will also change [21]. Correlation analysis, as a mathematical method for analyzing changes in the correlation between different variables, can describe the overall change rules of equipment states, and thus it can be applied to fault signal analysis [22–25] and multisensor fault diagnosis [26–29]. Kang et al. [21] correlated the different battery voltages with different sensors, used the recursive correlation coefficient to diagnose the fault signals, and accurately identified the locations and types of faults. Xiong et al. [27] proposed a new fault diagnosis method for rotating machinery fusing the dimensionless indices and the Pearson correlation coefficient. They then analyzed the variation of the correlations among the multiple sensors by calculating the correlation coefficients and derived the threshold value for fault diagnosis. Zhou et al. [28] developed a fast fault detection and location (FDL) method based on capacitive voltage similarity analysis, in which the correlation coefficient was adopted to represent the fault state and a threshold was used for fast fault detection and location. Zhao et al. [29] established the feature matrix of the expanded acoustic vibration signal by using the value of the Pearson correlation coefficient as the matrix element and performed fault diagnosis by using a convolutional neural network, thus realizing the effective utilization of sound and vibration signals.
The results of multisensor correlation analysis are usually presented in the form of a matrix, but conventional threshold processing methods diagnose only by individual correlation coefficients, losing the meaning of correlation analysis. When intelligent diagnosis methods are adopted, conventional methods have difficulty dealing with the matrix directly, and flattening it into one dimension leads to the problem of high dimension. At the same time, different types of correlation analysis methods have different requirements on data distribution, and the distribution laws of different types of equipment monitoring data differ considerably, so it is necessary to establish a new correlation analysis method that adapts to different types of data.

In view of this, this paper proposes a new analysis method combining composite correlation analysis and deep learning theory. According to the characteristics of the traditional correlation analysis, a kind of composite correlation coefficient is designed and calculated between sensors of different states. The correlation coefficient matrix is treated as representation of fault feature information of different equipment states and is transformed into images. Deep mining of the correlation coefficient is performed by the deep learning method with multilayer network, and thus the relationship between the correlation coefficient and equipment state is obtained for fault diagnosis. CNN is selected for fault identification in this paper because it has been successfully applied in the field of mechanical equipment fault diagnosis and has shown good performance [30, 31]. The specific procedures are as follows: firstly, according to the data characteristics of different signal types, data preprocessing is carried out to obtain all kinds of eigenvalues, and eigenvalue matrices under different working states are constructed; then, the composite correlation coefficient is calculated on the eigenvalue matrix to obtain the correlation coefficient matrix representing the relationship among the sensors and to realize the image generation of sensor data; finally, the deep convolutional neural network algorithm is used to directly identify the correlation coefficient matrix images so as to avoid the influence of artificially selecting features and improve the accuracy and efficiency of diagnosis. Analysis of rotor faults experiments verifies the efficiency of the proposed method in complex fault diagnosis by multisource sensor fusion. The results show that the proposed method can accurately identify fault states and has higher accuracy than the traditional method.

Through the above analysis, the main contributions in this paper involve the following. (1) A fault information fusion method is proposed for multisource heterogeneous data, and a new correlation analysis matrix is established with the comprehensive advantages of several related analysis methods. By analyzing the correlation of multisource sensors, the changes of equipment states are represented, and the image generation of monitoring data is realized. It can reduce the data dimension, improve the computational efficiency, and prevent the fault information loss caused by the direct comparison or normalization between data of different types and different orders of magnitude. (2) Considering the characteristic of high dimension and large amount of the monitoring data, a fault diagnosis model based on correlation analysis and deep learning is built which directly trains and recognizes the correlation matrix image of a large number of monitoring data. The method avoids the issues of low efficiency and dimensionality curse by the conventional pattern recognition method for large and high-dimensional data, and the fault diagnosis accuracy is improved.

2. Theoretical Background

2.1. Correlation Analysis

Correlation analysis generally refers to the analysis method to study the relationship between variables, that is, to study the change relationship of another variable when one changes. The value describing this relationship is called correlation coefficient. Data of sensors in mechanical equipment monitoring are continuous variables. For continuous variables correlation analysis, correlation coefficients include Pearson and Spearman correlation coefficients, and multiple correlation analysis is commonly used. Meanwhile, since this paper focuses on the relationship between different sensors, it needs the correlation analysis of signal from one-to-one and one-to-many sensors, so the above three correlation analysis methods are discussed in the paper.

2.1.1. Pearson Correlation Analysis

Pearson correlation coefficient is a statistical index describing the degree of correlation between variables, and the value varies between −1 and 1. When the changes between two variables are consistent, the value is greater than 0, and especially, it is called complete correlation when the value is 1. On the contrary, when the changes between two variables are opposite, which is called negative correlation, the value is less than 0. When there is no correlation between the changes of two variables, the value is 0, which is called uncorrelated. Let $(x_i, y_i)$ ($i = 1, 2, \ldots, n$) be the samples of the continuous monitoring variables $(x, y)$; then, the Pearson correlation coefficient is

$$r_p = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{1}$$

where $r_p$ is the Pearson correlation coefficient; $\bar{x}$ and $\bar{y}$ are the mean values of the x and y samples, respectively; and n is the length of the variable. Pearson correlation coefficient requires the variable to conform to the normal distribution, and if not, the calculated results will be greatly deviated.
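As a minimal sketch of this coefficient, the sum-of-products form can be written directly in NumPy. The function name `pearson` and the sample values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the sample mean of x
    dy = y - y.mean()  # deviations from the sample mean of y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Illustrative values only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_same = pearson(x, [2.0, 4.0, 6.0, 8.0, 10.0])      # consistent change
r_opposite = pearson(x, [5.0, 4.0, 3.0, 2.0, 1.0])   # opposite change
print(r_same, r_opposite)
```

The two calls illustrate the consistent and opposite cases described in the text; the same value can be cross-checked against `np.corrcoef`.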

2.1.2. Spearman Correlation Analysis

Spearman rank correlation is a method to study the correlation between two variables based on rank data. The data requirements of Spearman rank correlation are less strict than Pearson’s correlation. As long as the observed values of two variables are rank data in pairs or the ones transformed by continuous variable observation data, Spearman rank correlation can be used, regardless of the overall distribution of the two variables and the size of samples. Spearman correlation coefficient $r_s$ is calculated as follows:

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} \left( R(x_i) - R(y_i) \right)^2}{n (n^2 - 1)}, \tag{2}$$

where $R(x_i)$ and $R(y_i)$ are the ranks of the two variables $x_i$ and $y_i$ in their respective column vectors and n is the length of the variable. Spearman correlation coefficient is improved from Pearson correlation coefficient and has the effect of eliminating its error, which is suitable for the case when the normal distribution is not satisfied. However, due to the need of data ranking during the calculation, the effect is worse than Pearson correlation coefficient when normal distribution is satisfied.
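The rank-based calculation can be sketched as follows, assuming the common rank-difference form r_s = 1 − 6Σd²/(n(n² − 1)) with d = R(xi) − R(yi) and no tied values. The helper names `ranks` and `spearman` and the data are hypothetical.

```python
import numpy as np

def ranks(v):
    """Ranks 1..n of the entries of v (assumes no tied values)."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def spearman(x, y):
    """Spearman coefficient from rank differences (no ties assumed)."""
    d = ranks(x) - ranks(y)
    n = len(d)
    return float(1.0 - 6.0 * np.sum(d**2) / (n * (n**2 - 1)))

# Monotonic but nonlinear relation: the ranks agree exactly
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 8.0, 27.0, 64.0, 125.0]
r_mono = spearman(x, y)
r_reversed = spearman(x, x[::-1])
print(r_mono, r_reversed)
```

Because only the ordering enters the calculation, the cubic relation above gives the same coefficient as a linear one, which is why the method does not depend on the data following a normal distribution.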

2.1.3. Complex Correlation Analysis

In practical analysis, a variable is often subject to the comprehensive influence of a variety of variables. The so-called complex correlation means to study the correlation between multiple variables with one at the same time. The index to measure the degree of complex correlation is the complex correlation coefficient. The correlation between multiple variables with one at the same time cannot be directly measured, and only indirect calculation can be done. The complex correlation coefficient $r_c$ between a variable y and multiple variables $x_i$ ($i = 1, 2, \ldots, n$) is calculated as follows:

$$r_c = \sqrt{\frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}}, \tag{3}$$

where $\hat{y}$ is the linear regression of y on $x_i$ and $\bar{y}$ is the mean value of y. Complex correlation coefficient is often used in multiple linear regression analysis. The correlation degree between the dependent variable and a group of independent variables is expected, which is called complex correlation, and the coefficient reflects the “close” degree between one and a group of variables.
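The indirect calculation can be illustrated with an ordinary least-squares fit: regress y on the group of variables, then compare the spread of the fitted values with the spread of y. This is a sketch under the assumption that the regression includes an intercept; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def complex_correlation(y, X):
    """Multiple correlation between y (length m) and the columns of X (m x k)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])      # intercept + regressors
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares regression
    y_hat = A @ coef                               # fitted values
    y_bar = y.mean()
    return float(np.sqrt(np.sum((y_hat - y_bar)**2) / np.sum((y - y_bar)**2)))

# Synthetic data: y is an exact linear combination of two variables
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y_exact = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
r_exact = complex_correlation(y_exact, X)

# Adding noise lowers the coefficient below 1
y_noisy = y_exact + rng.normal(scale=5.0, size=50)
r_noisy = complex_correlation(y_noisy, X)
print(r_exact, r_noisy)
```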

2.2. Convolutional Neural Network Theory

Convolutional neural network is a kind of bionics algorithm imitating biological neural network. Being a representative algorithm in deep learning theory, the difference with the traditional neural network is that it has a deeper network structure in order to simulate biological neural networks more accurately. The traditional neural network has problems in establishing the multilayer structure, such as the complex network, too many nodes and parameters, slow convergence, and computational difficulty. Convolutional neural network avoids these problems by the approach of local connections and weight sharing [32].

Local connection is different from the full connection in the traditional neural network. It means that the neuron nodes in a certain layer of the neural network are not connected to all the neurons in the upper and lower adjacent layers but only to some of the adjacent neurons according to certain rules. In this way, the number of connections is greatly reduced, and the size of the neural network is decreased. Especially when dealing with high-dimensional data, the full connection method is difficult to apply because of the complexity of the network structure and the rapidly increasing number of parameters. Through local connection, however, the neural network structure is simplified with fewer parameters, and the network availability is improved.

Weight sharing is another characteristic of convolutional neural networks. The concept refers to the fact that, in a locally connected network, the parameters are the same when different upper and lower neurons are connected. On the basis of local connection, this method greatly reduces the number of parameters and improves the generalization ability of the network.
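The effect of local connection and weight sharing on model size can be made concrete by counting parameters. The layer sizes below (a 12 × 12 single-channel input, 16 feature maps, 3 × 3 kernels) are hypothetical values chosen only for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus biases of a convolution layer with shared kernels."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels

n_pixels = 12 * 12  # hypothetical input size

# Fully connected: every output unit of 16 same-size maps sees every input pixel
full = dense_params(n_pixels, 16 * n_pixels)

# Convolution: each of the 16 maps reuses one 3 x 3 kernel at every position
shared = conv_params(3, 3, 1, 16)

print(full, shared)
```

With these sizes the shared-kernel layer needs 160 parameters, against several hundred thousand for the fully connected alternative.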

In addition to the above two characteristics, the network structure of convolutional neural network is different from traditional one, which has more complex structure and more layers. It usually consists of input layer, multiple convolution layers, pooling layers, fully connected layer, and output layer, and the convolution and pooling layers are unique to CNN, as shown in Figure 1.

When training a convolutional neural network, overfitting may occur if the training data are too few or the data image is too large. Although better training accuracy can be obtained, the recognition rate is lower when the trained model is applied to other data.

In order to solve this problem, Alex [33] proposed the dropout technology; in this method, some neurons stop working when training, which allows a neuron to not entirely depend on others. Thus, the neural network is decomposed into multiple subnetworks, with weight sharing and the same number of layers, and finally the results of each subnetwork are averaged. Through this method, one can improve the network generalization ability and stability and reduce overfitting of the network [34].

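Figure 2 shows the difference between a fully connected neural network and the one based on dropout technique. The calculation of neural network with full connection is performed as follows:

$$z_i^{(l+1)} = \mathbf{w}_i^{(l+1)} \mathbf{y}^{(l)} + b_i^{(l+1)}, \qquad y_i^{(l+1)} = f\left(z_i^{(l+1)}\right). \tag{4}$$

By using dropout, the calculation equation is [35]

$$r_j^{(l)} \sim \mathrm{Bernoulli}(p), \qquad \tilde{\mathbf{y}}^{(l)} = \mathbf{r}^{(l)} \odot \mathbf{y}^{(l)}, \qquad z_i^{(l+1)} = \mathbf{w}_i^{(l+1)} \tilde{\mathbf{y}}^{(l)} + b_i^{(l+1)}, \qquad y_i^{(l+1)} = f\left(z_i^{(l+1)}\right). \tag{5}$$

Here $\mathbf{y}^{(l)}$ is the output vector of layer l, $\mathbf{w}_i^{(l+1)}$ and $b_i^{(l+1)}$ are the weights and bias of neuron i in layer l + 1, and f is the activation function. In equation (5), the role of the Bernoulli function is to make each mask element $r_j^{(l)}$ take the value 1 with probability p and 0 otherwise so as to realize the shielding of neurons assigned as 0.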
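A minimal NumPy sketch of the two forward passes is given below. It assumes that p is the probability that a mask element equals 1, that the activation is tanh, and that the layer sizes are arbitrary; none of these choices is taken from the paper, and the sketch covers only the masking step, not any rescaling applied at test time.

```python
import numpy as np

def dense_forward(y_prev, W, b, f=np.tanh):
    """Fully connected layer: every input contributes to every output."""
    return f(W @ y_prev + b)

def dropout_forward(y_prev, W, b, p, rng, f=np.tanh):
    """Same layer with a Bernoulli mask on its inputs (1 with probability p)."""
    r = rng.binomial(1, p, size=y_prev.shape)  # mask of 0s and 1s
    y_masked = r * y_prev                      # inputs assigned 0 are shielded
    return f(W @ y_masked + b), r

# Hypothetical layer: 4 inputs, 3 outputs
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
y_prev = rng.normal(size=4)

out_full = dense_forward(y_prev, W, b)
out_drop, mask = dropout_forward(y_prev, W, b, p=0.6, rng=rng)
print(mask, out_full, out_drop)
```

Each call to `dropout_forward` draws a new mask, so successive training steps effectively train different subnetworks that share the same weights.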

3. Proposed Method

In fault diagnosis of mechanical equipment based on multisensor information, the key problem lies in how to make comprehensive use of all monitoring information. In order to find an effective way to realize the diagnosis by multisource heterogeneous multisensor information fusion, a method based on composite correlation analysis and deep learning is proposed in the paper, and a new correlation matrix is set up to represent the correlation changing relationship between different sensors, and the 1-dimensional data are transformed into 2-dimensional images. Then, combining the deep convolutional neural network, the fault diagnosis model is built to directly analyze the image to realize the fault diagnosis and to improve fault recognition accuracy. The framework of the method is shown in Figure 3. It is composed of several modules including data acquisition, feature extraction, correlation analysis based on feature fusion, and fault classification.

Different kinds of sensors are used to collect the monitoring data of mechanical equipment, and feature extraction is performed on the data according to the characteristics of different sensors. Then, the characteristic values are used to analyze the correlation between different sensors, and the new composite correlation matrix is calculated and visualized by image generation. Then, a fault diagnosis model based on deep convolutional neural network is established to classify the fault and to identify the fault pattern.

3.1. Composite Correlation Analysis Matrix

The multisource sensor information fusion method studied in this paper makes use of the interrelationship between different sensors, which requires correlation analysis of multidimensional eigenvalues of signals collected by different sensors. According to Section 2, the commonly used Pearson correlation analysis requires the to-be-analyzed data satisfying normal distribution; however, in mechanical fault diagnosis, due to the nonstationarity and noise influence of data, part of the data satisfies normal distribution, while the other part does not. At the same time, when the fault occurs, it may cause correlation changes between two sensors, or a sensor with other multiple ones. In this situation, the traditional correlation analysis method is unable to deal with the problem simultaneously. Therefore, based on the characteristics of Pearson, Spearman, and complex correlation coefficient, a comprehensive correlation analysis coefficient $r_n$ is established in this paper. For n sensors, each characteristic vector is calculated, respectively, to form an eigenvalue matrix with n columns. As to the characteristic vectors $x_i$, $x_j$ ($i, j = 1, 2, \ldots, n$) of two arbitrary sensors, the composite correlation coefficient calculation equation is as follows:

$$r_n(x_i, x_j) = \begin{cases} r_p(x_i, x_j), & i > j, \\ r_c(x_i), & i = j, \\ r_s(x_i, x_j), & i < j, \end{cases} \tag{6}$$

where $r_p$, $r_s$, and $r_c$ are the Pearson, Spearman, and complex correlation coefficients, with each item the same as in equations (1)–(3); for $i = j$, $r_c(x_i)$ is taken between $x_i$ and the characteristic vectors of the other sensors.

According to equation (6), the element rn(xi, xj) in the composite correlation matrix is calculated differently according to the indices: when the index i is greater than j, the Pearson correlation coefficient equation is used; when the index i is equal to j, the complex correlation coefficient equation is used; and when the index i is less than j, the Spearman correlation coefficient equation is used. By calculating the correlation coefficient matrix in this way, the three correlation coefficients are combined to form the composite correlation analysis matrix, which realizes the comprehensive consideration of the changes between sensors under different states and avoids the loss of information.
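The index rule described above can be sketched as a small NumPy routine: Pearson below the diagonal, Spearman above it, and the complex correlation coefficient on the diagonal. The sketch assumes a least-squares regression with an intercept for the diagonal, the rank-difference form of Spearman without ties, and a hypothetical feature matrix of 20 rows and 4 sensors; all function names are illustrative.

```python
import numpy as np

def pearson(x, y):
    dx, dy = x - x.mean(), y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

def spearman(x, y):
    # Ranks via double argsort; assumes no tied values
    rx = np.argsort(np.argsort(x)) + 1
    ry = np.argsort(np.argsort(y)) + 1
    d = (rx - ry).astype(float)
    n = len(x)
    return 1.0 - 6.0 * np.sum(d**2) / (n * (n**2 - 1))

def complex_corr(y, X):
    # Regress y on the columns of X (with intercept), compare fitted spread to total spread
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ coef
    return np.sqrt(np.sum((y_hat - y.mean())**2) / np.sum((y - y.mean())**2))

def composite_matrix(F):
    """F has one column per sensor; returns the n x n composite matrix."""
    F = np.asarray(F, dtype=float)
    n = F.shape[1]
    R = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if i > j:                      # lower triangle: Pearson
                R[i, j] = pearson(F[:, i], F[:, j])
            elif i < j:                    # upper triangle: Spearman
                R[i, j] = spearman(F[:, i], F[:, j])
            else:                          # diagonal: sensor i versus all others
                R[i, i] = complex_corr(F[:, i], np.delete(F, i, axis=1))
    return R

rng = np.random.default_rng(0)
F = rng.normal(size=(20, 4))   # hypothetical feature matrix
R = composite_matrix(F)
print(np.round(R, 3))
```

Unlike an ordinary correlation matrix, the result is generally not symmetric. Note also that in this sketch the diagonal evaluates to 1 whenever the number of rows does not exceed the number of regression coefficients, because the least-squares fit is then exact.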

3.2. Establishment of Fault Diagnosis Model Based on Convolutional Neural Network

How the composite correlation coefficient matrix of multisource sensors is used to build the fault diagnosis model has great influence on diagnosis accuracy. The traditional methods need to transform the matrix into a one-dimensional vector, and then methods such as neural network, SVM, and cluster analysis are used for diagnosis. According to the characteristics of the composite correlation coefficient matrix, a diagnosis model based on a convolutional neural network is established, and processing is performed in the form of a two-dimensional matrix in order to improve the calculation efficiency and accuracy.

When using the convolutional neural network for classification, the input of training and classification model is required to be images. Therefore, the composite correlation coefficient matrix is firstly visualized by image generation. Since the matrix is two-dimensional, it is transformed into an 8 bit gray image in this paper as follows:

$$g = 255 \times \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \tag{7}$$

where $x$ represents the element in the composite correlation coefficient matrix, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the elements, and $g$ is the gray value of the corresponding image pixel after transformation.
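A minimal sketch of this min-max mapping to an 8 bit gray image is shown below. Rounding to the nearest integer and the requirement that the matrix is not constant are assumptions of the sketch; the function name `to_gray8` and the sample matrix are hypothetical.

```python
import numpy as np

def to_gray8(R):
    """Map matrix elements linearly onto 0..255 and store them as uint8."""
    R = np.asarray(R, dtype=float)
    x_min, x_max = R.min(), R.max()
    if x_max == x_min:
        raise ValueError("constant matrix cannot be rescaled")
    scaled = 255.0 * (R - x_min) / (x_max - x_min)
    return np.round(scaled).astype(np.uint8)  # rounding is an assumption

# Illustrative 2 x 2 matrix of correlation values
R = np.array([[1.0, 0.5],
              [-0.2, 0.1]])
img = to_gray8(R)
print(img)
```

The largest element maps to white (255) and the smallest to black (0), so brightness in the resulting image tracks the correlation degree.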

According to the requirements of mechanical equipment monitoring and the characteristics of correlation coefficient matrix, the CNN structure is designed for the gray image of composite correlation coefficient matrix in this paper, as illustrated in Figure 4, where the number of the convolution layers and the number of pooling layers are both two, the number and size of convolution kernel of each layer are determined according to the actual situation, and the activation function of each layer is chosen as the ReLU function. In order to avoid the overfitting phenomenon in the training of convolutional neural network, the second pooling layer is followed by a single-layer perceptron and then connected to a dropout layer with the shielding probability of 0.4. Softmax classifier is adopted in the output layer, and the final output is the fault recognition result.

At the same time, because the speed of parameter updating in the model is determined by the learning rate, an adaptive learning rate is used in the experiment to speed up model optimization, rather than a fixed value chosen by experience. The initial learning rate is set to 0.001. During the training, at the end of each epoch, the loss and precision of the current model are evaluated on the validation set. The change of the loss value is checked every other epoch, and when it is less than 0.0001, the learning rate lr is attenuated, with the calculation of attenuated lr∗ as follows:
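The checking logic can be sketched in plain Python as follows. This is only an illustration of the schedule: the decay ratio `factor` is a hypothetical placeholder and does not represent the attenuation rule of the paper, and "every other epoch" is interpreted here as comparing the validation loss with its value two epochs earlier.

```python
def schedule_learning_rate(val_losses, lr0=0.001, tol=1e-4, factor=0.5, check_every=2):
    """Return the learning rate used in each epoch.

    factor is a hypothetical decay ratio, not the paper's attenuation rule.
    """
    lr = lr0
    rates = []
    for epoch, loss in enumerate(val_losses):
        rates.append(lr)  # rate in effect during this epoch
        # At the end of every `check_every`-th epoch, compare with the earlier loss
        if (epoch + 1) % check_every == 0 and epoch >= check_every:
            change = abs(val_losses[epoch - check_every] - loss)
            if change < tol:
                lr *= factor  # loss has plateaued: attenuate the learning rate
    return rates

# Illustrative validation losses that flatten out after a few epochs
losses = [1.0, 0.5, 0.3, 0.29995, 0.29994, 0.29993, 0.29993]
rates = schedule_learning_rate(losses)
print(rates)
```

Keras offers a comparable built-in mechanism in its `ReduceLROnPlateau` callback, which could be used instead of a hand-written loop.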

4. Experimental Studies

A variety of faults might occur during the operation of large mechanical equipment; sometimes, two or more concurrent or coupling faults even occur at the same time. In order to solve such problems, researchers have introduced different monitoring tools. For example, in rotor fault diagnosis, in addition to the traditional vibration monitoring, temperature field monitoring by infrared image can better deal with concurrent or coupling faults [36, 37]. Given this background, a rotor fault experiment is conducted in this paper, and both vibration data and infrared images are collected for fault diagnosis. The normal state and 6 fault states are simulated, including coupling faults that are difficult to handle by traditional methods.

4.1. Experimental Setup

The testbed is built in the laboratory to simulate different working states of rotating machinery and to collect data. The experimental hardware includes the ZT-3 rotor test bed, FLIR E50 infrared thermal camera, and MDES fault diagnosis system. Besides, the computer and signal cables are included. Figure 4 illustrates the distribution of the system.

The ZT-3 rotor testbed is composed of a governor, base, motor, coupling, and dual-rotor system. The rotor system consists of a rotating shaft, rotor, bearing, coupling, and bearing bracket. In the experiment, the motor speed is 6000 rpm.

The vibration signal is collected by the MDES fault diagnosis system, which includes a computer, acceleration sensor, and multichannel vibration signal acquisition instrument. As shown in Figure 5, the measurement points are chosen at four bearing supports, which are denoted as V1, V2, V3, and V4 from the motor end, respectively. At each measurement point, signals are collected in both vertical and horizontal directions. The sampling frequency of each vibration signal is 20 kHz, and the sampling length is 20,000 points.
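As a quick check of what one vibration record covers, using only the figures stated above (20 kHz sampling, 20,000 points, 6000 rpm) and assuming that the shaft turns at the stated motor speed:

$$T = \frac{20000}{20000\ \text{Hz}} = 1\ \text{s}, \qquad f_r = \frac{6000\ \text{rpm}}{60} = 100\ \text{Hz}, \qquad \frac{20000\ \text{Hz}}{100\ \text{Hz}} = 200\ \text{samples per revolution}.$$

Each record therefore spans about 100 shaft revolutions.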

Infrared images are collected by the FLIR E50 infrared thermal imager. During the experiment, the infrared thermal imager is fixed on a tripod to ensure that all infrared images are collected under the same condition.

Seven states are simulated in the experiment, including normal state (NS), imbalance (IB), misalignment (MA), rub impact (RI), bearing set loose (BSL), coupling faults of rub impact and misalignment (CFRM), and coupling faults of bearing set loose and misalignment (CFBM). Forty images and 40 sets of vibration data in each channel are collected at each state, of which 20 datasets are used for training and the remaining 20 are used for testing. The experiment was conducted under the Keras deep learning framework on hardware with an I9-9700 CPU and an RTX2080ti GPU.

4.2. Data Processing

In the experiment, firstly, the vibration and infrared data in rotating equipment monitoring are collected. Then, feature extraction is performed on the infrared image and vibration signal, respectively. The correlation coefficients of each infrared and vibration signal are calculated, and the composite correlation coefficient matrix is constructed for information fusion and converted into gray image. Finally, the deep learning model based on CNN is used for training and classification to realize fault diagnosis.

The infrared images are processed in accordance with the method in literature [37]. In this method, the infrared images captured from the thermal camera are firstly segmented into rectangular regions of different divisions, and regions which are sensitive to faults are then picked out by dispersion degree criterion; one specific region corresponds to a relevant independent fault. After processing the infrared image by this method, four regions of interest (ROIs) in the infrared images are selected, as shown in Figure 6. These four regions show greater differences at different fault states, so the background interference is eliminated and key fault information is retained by using ROI for fault diagnosis. Since each ROI represents the temperature change at a certain range, they can be regarded as four independent data sources when these four ROIs are extracted from the image. Therefore, these four ROIs are taken as data collected by four sensors.

Histogram features of each ROI are calculated as the characteristic values of four temperature sensors. The calculation of gray histogram information refers to equation (9), and the equations for calculating histogram features are shown in Table 1.

$$H(k) = \frac{P(k)}{T}, \qquad k = 0, 1, \ldots, N, \tag{9}$$

where $k$ represents the gray level, $P(k)$ represents the number of pixel points with a gray level of $k$ in the image, $T$ represents the total number of pixels in the image, and $N$ represents the maximum value of the gray level in the image.
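A minimal sketch of the normalized gray histogram of an ROI, together with two illustrative statistics derived from it, is given below. The mean and variance are shown only as examples of histogram features; they are not claimed to match the entries of Table 1, and the function names and the tiny ROI are hypothetical.

```python
import numpy as np

def gray_histogram(roi, n_levels=256):
    """Fraction of pixels at each gray level (pixel count divided by total pixels)."""
    roi = np.asarray(roi)
    counts = np.bincount(roi.ravel(), minlength=n_levels)  # pixels per gray level
    return counts / roi.size

def histogram_mean_and_variance(h):
    """Two example features computed from a normalized histogram."""
    k = np.arange(len(h))
    mean = np.sum(k * h)
    variance = np.sum((k - mean)**2 * h)
    return float(mean), float(variance)

# Tiny illustrative ROI: half black, half white
roi = np.array([[0, 0],
                [255, 255]], dtype=np.uint8)
h = gray_histogram(roi)
mean, variance = histogram_mean_and_variance(h)
print(mean, variance)
```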

Similar to the infrared image feature extraction method, in order to simplify the analysis and avoid the differences introduced by the complex algorithm, the vibration data feature values obtained by the 8 vibration sensors are calculated by the commonly used nondimensional indicators in the time domain, as shown in Table 2.
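For illustration, six widely used nondimensional time-domain indicators can be computed as below. The particular set (shape, crest, impulse, and clearance factors, skewness, and kurtosis) is an assumption made for this sketch and is not claimed to match Table 2; the test signal is a synthetic sine wave.

```python
import numpy as np

def nondimensional_indicators(x):
    """Common nondimensional time-domain indicators of a vibration record."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x**2))               # root mean square
    peak = abs_x.max()                          # peak absolute amplitude
    mean_abs = abs_x.mean()                     # mean absolute amplitude
    root_amp = np.mean(np.sqrt(abs_x))**2       # square-root amplitude
    centered = x - x.mean()
    std = centered.std()
    return {
        "shape_factor": rms / mean_abs,
        "crest_factor": peak / rms,
        "impulse_factor": peak / mean_abs,
        "clearance_factor": peak / root_amp,
        "skewness": np.mean(centered**3) / std**3,
        "kurtosis": np.mean(centered**4) / std**4,
    }

# Synthetic 100 Hz sine sampled at 20 kHz for 1 s
t = np.arange(20000) / 20000.0
x = np.sin(2.0 * np.pi * 100.0 * t)
features = nondimensional_indicators(x)
print(features)
```

Because every indicator is a ratio of amplitude statistics, scaling the signal by a constant leaves the values unchanged, which is what makes them convenient for comparing sensors with different sensitivities.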

A 12 × 6 eigenvalue matrix is constructed from the four temperature eigenvectors and eight vibration eigenvectors. The matrix is processed according to equation (6), and the composite correlation coefficient matrix is obtained. As an example, a correlation coefficient matrix at the normal state is shown in Table 3.

It can be seen from Table 3 that the vibration signals are highly correlated from the sensors in two directions at the same measuring point, and the signals from sensors in the same direction at different measuring points are also highly correlated. The correlation of infrared data from each ROI is relatively high, and the correlation between infrared and vibration signals is relatively low, which is in accordance with the situation in signal collection. The correlation coefficient matrix obtained at six fault states is visualized by image generation and illustrated as in Figure 7.

In Figure 7, the brightness of the gray-scale image represents the correlation degree. Higher brightness refers to higher correlation degree, and lower brightness represents lower correlation degree. It can be seen from Figure 7 that colors change differently with different faults. Therefore, through correlation analysis, it can be seen that the correlation between relevant monitoring parameters changes at different states of the equipment, and the fault can be diagnosed according to the changes.

4.3. Result Analysis

In this classification experiment, 20 sets of composite correlation coefficient matrix images are randomly selected as the training data at each state of the rotor system, and the remaining 20 sets of images are taken as test data, which means that 140 sets of images form the training set and the remaining 140 sets form the test set. According to the image resolution, the structure of the CNN is shown in Figure 8; the numbers of convolution kernels in the first and second convolutional layers are 16 and 32, with sizes of 3 × 3 and 2 × 2, respectively. The size of the pooling layer is 2 × 2.
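The spatial sizes that such a two-convolution, two-pooling stack produces can be traced with simple arithmetic. The sketch below assumes a 12 × 12 input image (one pixel per sensor pair), a 3 × 3 first kernel, a 2 × 2 second kernel, unpadded ("valid") convolutions with stride 1, and non-overlapping 2 × 2 pooling; these settings are assumptions for illustration and may differ from the network in Figure 8.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool):
    """Output width of non-overlapping pooling."""
    return size // pool

n_states, n_train_per_state = 7, 20
n_train = n_states * n_train_per_state   # training images over all states

size = 12                  # assumed 12 x 12 input image
sizes = [size]
size = conv_out(size, 3)   # first convolution, 16 kernels
sizes.append(size)
size = pool_out(size, 2)   # first pooling
sizes.append(size)
size = conv_out(size, 2)   # second convolution, 32 kernels
sizes.append(size)
size = pool_out(size, 2)   # second pooling
sizes.append(size)

flattened = size * size * 32   # features passed to the dense layer
print(n_train, sizes, flattened)
```

Under these assumptions the feature maps shrink from 12 to 2 pixels per side, leaving 128 values for the fully connected stage.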

The classification result after 300 times of training is shown in Figure 9, and the accuracy in the test is 99.29%.
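The reported figure can be related to the size of the test set with a one-line calculation:

$$\frac{139}{140} \approx 0.9929,$$

so an accuracy of 99.29% is consistent with 139 of the 140 test images being classified correctly. This count is inferred from the percentage and is not stated explicitly in the text.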

For comparison, based on the diagnosis model in this paper, the Pearson, Spearman, and complex correlation coefficient matrices are each applied for fault diagnosis in place of the proposed composite correlation coefficient matrix. Moreover, in order to compare with the traditional methods, after feature extraction, 4 temperature feature vectors and 8 vibration feature vectors are directly combined, and the traditional BP, SVM, and KNN are used for fault diagnosis. The results are shown in Table 4.

It can be seen from Table 4 that, in the case of multisensor data acquisition, the diagnosis result of the traditional feature-level fusion and classification model is not ideal because of the high eigenvector dimension, particularly when there are many fault types and complex faults exist. In contrast, the proposed method avoids the shortcoming of single correlation analysis, makes comprehensive use of all effective information, and achieves satisfactory diagnosis results by combining the advantages of the deep learning method.

5. Conclusions

In this paper, a new fault diagnosis method based on correlation analysis and deep learning is proposed. A new composite correlation analysis method is established to perform feature-level fusion of sensor data from different sources, and then the correlation coefficient matrix image is processed directly by the CNN algorithm to complete fault diagnosis. Through the case study, the following conclusions are drawn: (1) The changes of equipment state can be represented through the correlation analysis of multiple sensors, and multisource information fusion is carried out. It can reduce the data dimension, improve the computing efficiency, and avoid the loss of fault information caused by direct comparison or normalization between data of different types and different orders of magnitude. (2) The new correlation analysis method integrates the advantages of several correlation analysis methods and is suitable for heterogeneous sensor data with different distributions and influence relationships, so that better results are obtained in fault diagnosis. (3) By constructing the fault diagnosis method combining correlation analysis and a deep learning model, the images transformed from the correlation matrix of the monitoring data are trained and identified directly. Compared with the traditional method, the model is simplified and the fault diagnosis accuracy is higher.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors acknowledge the financial support provided by the National Natural Science Foundation of China (51975038 and 51605023), Beijing Municipal Natural Science Foundation, China (19L00001), General Project of Scientific Research Program of Beijing Education Commission (KM202010016003 and SQKM201810016015), Postdoctoral Science Foundation of Beijing, China (ZZ2019-98), Support Plan for the Construction of High-Level Teachers in Beijing Municipal Universities (CIT&TCD201904062 and CIT&TCD201704052), Scientific Research Fund of Beijing University of Civil Engineering and Architecture (00331615015), BUCEA Post Graduate Innovation Project (PG2019092), and Fundamental Research Funds for Beijing University of Civil Engineering and Architecture (X18133).