Neural Networks, Volume 122, February 2020, Pages 395-406

2019 Special Issue
Simultaneously learning affinity matrix and data representations for machine fault diagnosis

https://doi.org/10.1016/j.neunet.2019.11.007

Abstract

Recently, preserving the geometry information of data while learning representations has attracted increasing attention in intelligent machine fault diagnosis. Existing geometry-preserving methods require the similarities between data points to be predefined in the original data space. The predefined affinity matrix, also known as the similarity matrix, is then used to preserve geometry information during representation learning. Hence, the data representations are learned under the assumption of fixed and known prior knowledge, i.e., the similarities between data points. However, such assumed prior knowledge can hardly capture the real relationships between data points precisely, especially in high-dimensional space. Moreover, learning the affinity matrix and the data representations in two separate steps may not be optimal or universal for data classification. In this paper, based on the extreme learning machine autoencoder (ELM-AE), we propose to learn the data representations and the affinity matrix simultaneously. The affinity matrix is treated as a variable and unified into the objective function of ELM-AE. Instead of predefining and fixing the affinity matrix, the proposed method adjusts the similarities by taking into account their capability of capturing the geometry information in both the original data space and the non-linearly mapped representation space. Meanwhile, the geometry information of the original data can be preserved in the embedded representations with the help of the affinity matrix. Experimental results on several benchmark datasets demonstrate the effectiveness of the proposed method, and the empirical study also shows that it is an efficient tool for machine fault diagnosis.

Introduction

Machine fault diagnosis, as a critical component of predictive maintenance, aims to monitor machine health conditions from collected sensor data, e.g., using vibration data collected by accelerometers to determine the health condition of motors (Li, Chow, Tipsuwan, & Hung, 2000). A condition monitoring system with multiple sensors is designed to collect real-time data and inspect the health conditions of machines, and it acquires a large amount of data. Generally, the sensory data is collected much faster than it can be analyzed manually by diagnosticians. Therefore, machine learning techniques are widely used to build intelligent fault diagnosis systems and monitor machine health conditions.

Machine learning techniques require careful feature extraction before they can monitor machine health conditions well. Data representations, also known as features, are extracted from sensory data to remove redundant information and retain the relevant information. For example, Samanta and Al-Balushi (2003) extracted statistical features, e.g., root mean square, kurtosis, and skewness, from time-domain sensory data. Moreover, Al-Badour, Sunar, and Cheded (2011) used the wavelet transform to extract features from the time–frequency domain of the original data, and Zhong, Wong, and Yang (2018) used intrinsic mode functions, which are extracted by empirical mode decomposition, as the features for further machine health condition monitoring. However, manual feature extraction requires expert domain knowledge and can be labor-intensive.
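For illustration, the following is a minimal sketch of how such time-domain statistical features might be computed from segments of a vibration signal; the feature set, window length, and variable names are assumptions for this example and not the exact pipelines of the cited works.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def time_domain_features(segment):
    """Compute common time-domain statistics from one vibration segment.

    `segment` is assumed to be a 1-D NumPy array of accelerometer samples;
    the chosen statistics (RMS, kurtosis, skewness, peak, crest factor)
    are illustrative, not the exact feature set of the cited works.
    """
    rms = np.sqrt(np.mean(segment ** 2))    # root mean square (signal energy level)
    kurt = kurtosis(segment)                # impulsiveness, sensitive to bearing faults
    skw = skew(segment)                     # asymmetry of the amplitude distribution
    peak = np.max(np.abs(segment))          # largest absolute amplitude
    crest = peak / rms                      # crest factor: peak relative to RMS
    return np.array([rms, kurt, skw, peak, crest])

# Example: split a long vibration signal into fixed-length windows
# and build a feature matrix (one row per window).
signal = np.random.randn(120_000)           # placeholder for a recorded vibration signal
window = 2048
features = np.stack([time_domain_features(signal[i:i + window])
                     for i in range(0, len(signal) - window + 1, window)])
```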

Representation learning techniques are developed to overcome the weakness of manual feature extraction since they are designed to learn data representations automatically from raw data. For example, the convolutional neural network (CNN) (Krizhevsky, Sutskever, & Hinton, 2012) applies multiple learnable filters to the data points to learn data representations. Ince, Kiranyaz, Eren, Askar, and Gabbouj (2016) applied a 1-D CNN to learn data representations and monitor machine health conditions concurrently. The autoencoder (AE) (Rumelhart, Hinton, & Williams, 1985) learns data representations by reconstructing the original data points from their embedded representations. Hierarchical representations can be obtained by stacking AEs to form a deep structure, the stacked AE (SAE). Lu, Wang, Qin, and Ma (2017) used the SAE and the stacked denoising autoencoder (SDA) to learn data representations from the original vibration data of motors. However, these algorithms do not consider the geometry information between data points, which is an important property of representations.

The graph-based representation learning method is one of the most common methods that take advantage of geometry information while learning representations. Laplacian eigenmaps (LE) (Belkin & Niyogi, 2002) first constructs an affinity matrix based on the Euclidean distances between data points and their neighbors. The affinity matrix is then used to minimize the distances between each data point and its neighbors in the representation space. Li, Liyanaarachchi Lekamalage, Liu, Chen, and Huang (2019) proposed a novel representation learning algorithm named the fast autoencoder with local and global penalties (FAE-LG), which can preserve the local and global geometries of the original data points from vibration data of motors. Although LE and FAE-LG preserve the geometry information in data representations, they require a predefined and fixed affinity matrix built under assumed prior knowledge, e.g., a k-nearest neighbor graph with binary or Gaussian edge weights that represents the geometry relationships between data points. However, the assumed prior knowledge might not precisely represent the real geometry relationships between data points. Also, the potential relationship between the affinity matrix and the classes is not fully exploited since the affinity matrix is constructed independently of the subsequent tasks. Furthermore, the existing deep-learning-based algorithms require an iterative training procedure, which is time-consuming.
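For concreteness, the sketch below builds the kind of predefined, fixed affinity matrix that LE-style methods rely on: a k-nearest-neighbor graph with Gaussian (heat-kernel) edge weights computed once in the original data space. The bandwidth sigma, the neighborhood size k, and the column-wise data layout are assumptions of this illustration.

```python
import numpy as np

def knn_gaussian_affinity(X, k=5, sigma=1.0):
    """Predefined affinity matrix: k-NN graph with Gaussian (heat-kernel) weights.

    X is assumed to have shape (d, N), one column per sample, matching the
    paper's notation; k and sigma are user-chosen prior knowledge that stays
    fixed during representation learning.
    """
    N = X.shape[1]
    # Pairwise squared Euclidean distances between columns.
    sq_norms = np.sum(X ** 2, axis=0)
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X.T @ X
    np.maximum(D, 0.0, out=D)

    S = np.zeros((N, N))
    for i in range(N):
        # Indices of the k nearest neighbors of sample i (excluding itself).
        neighbors = np.argsort(D[i])[1:k + 1]
        S[i, neighbors] = np.exp(-D[i, neighbors] / (2.0 * sigma ** 2))
    # Symmetrize so that S can serve as an undirected graph.
    return np.maximum(S, S.T)
```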

One way to address the weakness of the predefined affinity matrix is to treat it as a variable and learn it during the training process. Nie et al. proposed an adaptive graph learning method in a clustering setting (Nie, Wang, & Huang, 2014). Instead of manually constructing the affinity matrix, the clustering with adaptive neighbors (CAN) (Nie et al., 2014) adaptively adjusts it during the clustering procedure. CAN was then extended to the projected clustering with adaptive neighbors (PCAN) (Nie et al., 2014). PCAN adjusts the affinity matrix based on the linearly projected data points and forces it to be suitable for the clustering task. Furthermore, Liu, Lekamalage, Huang, and Lin (2018) adaptively adjusted the affinity matrix based on the non-linear data embeddings obtained by an ELM-based algorithm. Inspired by the adaptive neighbor techniques used in the above studies, we can learn the affinity matrix and the non-linear data embeddings simultaneously in representation learning methods.
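As an illustration of the adaptive-neighbor idea, the sketch below shows the commonly used closed-form per-row update of the affinity matrix, in which each row is optimized under simplex constraints with a quadratic regularizer chosen so that exactly k neighbors keep a nonzero similarity; in CAN the distances come from the original space, while PCAN and the ELM-based variant compute them from projected or embedded data. Function and variable names are illustrative.

```python
import numpy as np

def adaptive_neighbor_row(dist_i, k):
    """Closed-form update of one row of the affinity matrix (CAN-style sketch).

    dist_i: squared Euclidean distances from sample i to all N samples
            (including itself, whose distance is zero);
    k:      number of neighbors allowed a nonzero similarity.

    Solves  min_{s >= 0, sum(s) = 1}  sum_j dist_i[j]*s[j] + gamma*s[j]**2,
    where gamma is set implicitly so that exactly k entries stay positive.
    """
    N = dist_i.shape[0]
    order = np.argsort(dist_i)            # order[0] is the sample itself
    d_sorted = dist_i[order]
    d_cut = d_sorted[k + 1]               # distance to the (k+1)-th neighbor
    denom = k * d_cut - np.sum(d_sorted[1:k + 1]) + 1e-12
    s = np.zeros(N)
    s[order[1:k + 1]] = (d_cut - d_sorted[1:k + 1]) / denom
    return s

# Example: adaptively recompute the whole affinity matrix from pairwise distances D
# (shape (N, N)); each row is updated independently at every iteration.
# S = np.stack([adaptive_neighbor_row(D[i], k=5) for i in range(D.shape[0])])
```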

The extreme learning machine autoencoder (ELM-AE) (Kasun et al., 2016, Kasun et al., 2013) was proposed to reduce the training time and improve the efficiency of representation learning algorithms. ELM-AE, which learns data representations based on singular values, is an extension of ELM. ELM, introduced by Huang, Huang et al. (2015), Huang et al. (2006) and Huang, Zhu, Siew, et al. (2004), is a single-layer feed-forward neural network (SLFN) with randomly generated and fixed hidden neurons. ELM is well known for its efficient learning procedure and excellent generalization capability. Therefore, various extensions of ELM have been developed (Cao et al., 2016, Deng et al., 2016, Huang, Liu et al., 2015, Mirza and Lin, 2016). As one of the extensions of ELM, ELM-AE utilizes the advantages of random hidden neurons and can embed the original data into another representation space efficiently. Moreover, multiple ELM-AEs can be stacked together to learn hierarchical representations. Cui, Huang, Kasun, Zhang, and Han (2017) stacked multiple ELM-AEs to build an extreme learning machine network (ELMNet) and achieved promising performance on a handwritten dataset. In this study, we utilize the high efficiency of ELM-AE to develop a new representation learning algorithm.
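As a rough sketch of the ELM-AE mechanism referred to above: a random, fixed hidden layer maps the input non-linearly, the output weights β are obtained in closed form by ridge-regularized least squares so that the hidden activations reconstruct the input, and β is then reused to embed the data. The orthogonal random weights and singular-value-based refinements of the original ELM-AE papers are omitted here; function and parameter names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_ae_fit(X, n_hidden=100, lam=1e-3):
    """Fit a basic ELM-AE-style autoencoder on X of shape (d, N).

    The hidden layer is random and fixed; only the output weights beta are
    learned, in closed form, by ridge-regularized least squares so that
    beta.T @ H reconstructs X. Simplified sketch: the original ELM-AE also
    uses orthogonal random weights / a singular-value-based variant.
    """
    d, N = X.shape
    W = rng.standard_normal((n_hidden, d))       # random input weights (fixed)
    b = rng.standard_normal((n_hidden, 1))       # random biases (fixed)
    H = np.tanh(W @ X + b)                       # non-linear hidden activations, (L, N)
    # Closed-form ridge solution of  min_beta ||beta^T H - X||^2 + lam ||beta||^2
    beta = np.linalg.solve(H @ H.T + lam * np.eye(n_hidden), H @ X.T)   # (L, d)
    return W, b, beta

def elm_ae_embed(X, beta):
    """Embed data with the learned output weights, as ELM-AE does."""
    return beta @ X                              # (L, N) representations

# Example usage on random data standing in for sensor features.
X = rng.standard_normal((32, 500))               # 32-dim features, 500 samples
W, b, beta = elm_ae_fit(X, n_hidden=100, lam=1e-3)
Z = elm_ae_embed(X, beta)
```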

In this paper, we propose a novel representation learning algorithm named the locality-preserving extreme learning machine autoencoder with adaptive neighbors (LELMAE-AN). The proposed method consists of two learning parts: affinity matrix learning and representation learning. The affinity matrix learning part learns an affinity matrix whose elements represent the similarities of all pairwise data samples. It penalizes pairs of samples that have a large Euclidean distance in both the original data space and the embedded representation space. Therefore, the similarity values are forced to be large if two data samples are close to each other and small if they are far apart. The representation learning part learns the data representations by minimizing the error between the reconstructed data and the original input data. Since both the affinity matrix learning and the representation learning subproblems are convex, they can be optimized efficiently.
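The full objective and its optimization are given in Section 3. Purely as an illustration of the alternating scheme just described, the sketch below switches between a CAN-style update of S, driven by distances in both the original and embedded spaces, and a closed-form update of the ELM-AE output weights with an added graph-Laplacian locality term. The specific objective terms, the embedding Z = beta @ X, and all hyper-parameter names are assumptions of this sketch; the paper's soft discrimination (label) constraint is omitted, and this is not the authors' implementation.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def pairwise_sq_dists(M):
    """Squared Euclidean distances between the columns of M (shape (dim, N))."""
    sq = np.sum(M ** 2, axis=0)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * M.T @ M, 0.0)

def lelmae_an_sketch(X, n_hidden=100, k=5, lam=1e-3, mu=1e-2, rho=1.0, n_iter=10, seed=0):
    """Illustrative alternating optimization in the spirit of LELMAE-AN.

    Assumptions (not taken from the paper): embeddings Z = beta @ X, an ELM-AE
    reconstruction term with a ridge penalty on beta, a graph-Laplacian locality
    term tr(Z L Z^T), and a CAN-style per-row update of S using distances
    combined over the original and embedded spaces. The paper's soft
    discrimination (label) constraint is omitted.
    """
    rng = np.random.default_rng(seed)
    d, N = X.shape
    W = rng.standard_normal((n_hidden, d))       # random, fixed input weights
    b = rng.standard_normal((n_hidden, 1))       # random, fixed biases
    H = np.tanh(W @ X + b)                       # hidden activations, (n_hidden, N)
    beta = np.linalg.solve(H @ H.T + lam * np.eye(n_hidden), H @ X.T)   # warm start
    D_x = pairwise_sq_dists(X)                   # distances in the original space

    for _ in range(n_iter):
        # --- S-step: adaptive neighbors from combined distances --------------
        Z = beta @ X                             # current embedded representations
        D = D_x + rho * pairwise_sq_dists(Z)
        S = np.zeros((N, N))
        for i in range(N):
            order = np.argsort(D[i])             # order[0] is sample i itself
            d_knn = D[i, order[1:k + 1]]
            d_cut = D[i, order[k + 1]]
            S[i, order[1:k + 1]] = (d_cut - d_knn) / (k * d_cut - d_knn.sum() + 1e-12)
        S = (S + S.T) / 2.0                      # symmetrize
        Lap = np.diag(S.sum(axis=1)) - S         # graph Laplacian of the current S

        # --- beta-step: reconstruction + locality term ------------------------
        # Zeroing the gradient of ||beta^T H - X||^2 + lam*||beta||^2
        # + mu*tr(beta X Lap X^T beta^T) yields a Sylvester equation:
        # (H H^T + lam I) beta + beta (mu X Lap X^T) = H X^T
        beta = solve_sylvester(H @ H.T + lam * np.eye(n_hidden),
                               mu * X @ Lap @ X.T,
                               H @ X.T)
    return beta, S
```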

In summary, the contributions of this paper are as follows.

  • (1)

    We propose a unified objective function that can concurrently learn data representations and the affinity matrix. Moreover, we propose a soft discrimination constraint, which is optimized together with the objective function. The constraint utilizes the label information to obtain the discriminative affinity matrix and representations, and the soft constraint prevents over-fitting compared to the hard constraint used in other affinity matrix learning methods.

  • (2)

In the proposed method, the affinity matrix is learned by utilizing the geometry information of both the original data points and their representations. The assumption is that pairwise data points with a smaller distance in both the original data space and the embedded representation space should have a higher similarity; thus, the geometry structure of the original data can be preserved in the representations. Additionally, the label information forces data points within the same class to have a higher similarity than those in different classes.

  • (3)

The proposed method learns non-linear representations with a fast learning speed and excellent generalization capability. By using ELM-AE, our method can approximate non-linear functions explicitly and obtain non-linear data representations together with the capability to learn the affinity matrix.

  • (4)

    We experimentally demonstrate that jointly learning the affinity matrix and representations can improve the performance of machine fault diagnosis.

The rest of this paper is organized as follows. Section 2 briefly describes related methods for the representation learning and the affinity matrix learning. Section 3 presents the details of the proposed method. The experimental designs and results are discussed in Section 4, and this paper concludes in Section 5. The notations used in this paper are summarized in Table 1.

Section snippets

Preliminaries

In this section, we describe related methods in representation learning and affinity matrix learning. Let the unlabeled training set with $N$ samples be $\mathbf{X}$, where $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N] \in \mathbb{R}^{d \times N}$.

Locality-preserving extreme learning machine autoencoder with adaptive neighbors

In this section, we formulate the objective function of the proposed LELMAE-AN. We then propose an optimization solution for the objective function. There are two learning tasks in the objective function: affinity matrix learning and representation learning. The affinity matrix learning aims to learn an affinity matrix, S, that contains geometry information and label information. The representation learning aims to learn a projection matrix, β, that is used to embed the input data into the representation space.

Experimental results and discussions

We compared the proposed LELMAE-AN with state-of-the-art algorithms on various real-world benchmark datasets. Firstly, the proposed method was tested on six standard machine learning datasets, namely four UCI datasets and two object recognition image datasets, to test its generalization. After that, we also applied the proposed method to the machine fault diagnosis task, i.e., the method was tested on the CWRU bearing vibration dataset. The details of the datasets are described below.

Conclusion

In this paper, a novel representation learning method, named LELMAE-AN, is proposed to improve the performance of machine fault diagnosis tasks. The proposed method exploited the geometry information from both the data space and the representation space by using a tunable affinity matrix. The affinity matrix also obtained discriminative information through a soft discrimination constraint. The input data could be efficiently mapped to non-linear representations by using the ELM-AE based method.

References (34)

Cited by (7)

  • A hybrid framework based on extreme learning machine, discrete wavelet transform, and autoencoder with feature penalty for stock prediction

    2022, Expert Systems with Applications
    Citation Excerpt :

    It is widely used in classification, regression problems (Y. Chen, Xie, Zhang, Bai, & Hou, 2020; Z. Liu, Jin, & Mu, 2020), clustering, feature selection, and representation learning (G. Huang, Song, Gupta, & Wu, 2014). More information about ELM has been extensively provided by other scholars (Alaba et al., 2019; Alade, Selamat, & Sallehuddin, 2017; J. Chen, Zeng, Li, & Huang, 2020; W. Deng, Zheng, & Chen, 2009; Ding, Zhao, Zhang, Xu, & Nie, 2015; Y. Li, Zeng, Liu, Jia, & Huang, 2020; Zeng, Chen, Li, Qing, & Huang, 2020; Zeng, Li, Chen, Jia, & Huang, 2020; J. Zhang, Li, Xiao, & Zhang, 2020b). After selecting an excellent machine learning model suitable for application scenarios, the data need to be processed for model training.

  • Residual wide-kernel deep convolutional auto-encoder for intelligent rotating machinery fault diagnosis with limited samples

    2021, Neural Networks
    Citation Excerpt :

    Therefore, it is necessary to monitor and control the operating conditions of rotating machinery. Moreover, how a fast and effective method can be developed for detecting faults in bearings and gears is still an essential issue in the modern industry (Klausen, Robbersmyr, & Karimi, 2017; Li, Xiang, Wang, Zhou, & Tang, 2018; Li, Zeng, Liu, Jia, & Huang, 2020; Yin, Wang, & Karimi, 2014; Zarei, Tajeddini, & Karimi, 2014). Hence, many kinds of artificial intelligent algorithms were applied in the problem of rotating machinery fault detection (Zhang et al., 2020; Zhao & Lai, 2019; Zhao, Lai and Chen, 2019), including machine learning algorithms (e.g. artificial neural networks Gangsar & Tiwari, 2020; Yu, Junsheng, et al., 2006, support vector machine (SVM) Liu, Bo, & Luo, 2015; Wu, Yin, & Karimi, 2014, k-nearest neighbors Wang, 2016 and logic regression Gangsar & Tiwari, 2020) and deep learning algorithms (e.g. convolutional neural networks (CNN) Jiang, He, Yan, & Xie, 2018, recurrent neural network (RNN) Li et al., 2018; Miao, Li, Sun, & Liu, 2019; Zhao et al., 2017, deep auto-encoders (DAE) Haidong, Hongkai, Ke, Dongdong, & Xingqiu, 2018; Sun et al., 2018, and deep belief network (DBN) Yu & Liu, 2020) depending on their superior properties.

  • Partial transfer learning in machinery cross-domain fault diagnostics using class-weighted adversarial networks

    2020, Neural Networks
    Citation Excerpt :

    Intelligent machinery fault diagnosis has been attracting increasing attention in the past years (Li, Zeng, Liu, Jia, & Huang, 2020; Liu et al., 2020; Luo, Wang, Tang, & Wang, 2019; Yu, Lin, Ma, Li, & Zeng, 2019; Zhao & Lai, 2019; Zhao, Lai, & Chen, 2019), due to the remarkable merits of high accuracy, fast response, easy implementation etc.

  • A physical knowledge-based extreme learning machine approach to fault diagnosis of rolling element bearing from small datasets

    2020, UbiComp/ISWC 2020 Adjunct - Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers