Neural Networks, Volume 122, February 2020, Pages 395-406

2019 Special Issue
Simultaneously learning affinity matrix and data representations for machine fault diagnosis

https://doi.org/10.1016/j.neunet.2019.11.007

Abstract

Recently, preserving the geometry information of data while learning representations has attracted increasing attention in intelligent machine fault diagnosis. Existing geometry-preserving methods require the similarities between data points to be predefined in the original data space. The predefined affinity matrix, also known as the similarity matrix, is then used to preserve geometry information during representation learning. Hence, the data representations are learned under the assumption of fixed and known prior knowledge, i.e., the similarities between data points. However, such assumed prior knowledge can hardly capture the real relationships between data points precisely, especially in high-dimensional space. Moreover, learning the affinity matrix and the data representations in two separate steps may not be optimal or universal for data classification. In this paper, based on the extreme learning machine autoencoder (ELM-AE), we propose to learn the data representations and the affinity matrix simultaneously. The affinity matrix is treated as a variable and unified into the objective function of ELM-AE. Instead of predefining and fixing the affinity matrix, the proposed method adjusts the similarities by taking into account their capability of capturing the geometry information in both the original data space and the non-linearly mapped representation space. Meanwhile, the geometry information of the original data can be preserved in the embedded representations with the help of the affinity matrix. Experimental results on several benchmark datasets demonstrate the effectiveness of the proposed method, and the empirical study also shows that it is an efficient tool for machine fault diagnosis.

Introduction

Machine fault diagnosis, as a critical component of predictive maintenance, aims to monitor machine health conditions from collected sensor data, e.g., using vibration data collected by accelerometers to determine the health condition of motors (Li, Chow, Tipsuwan, & Hung, 2000). A condition monitoring system with multiple sensors is designed to collect real-time data and inspect the health conditions of machines, and it acquires a large amount of data. Generally, the sensory data is collected much faster than it can be analyzed manually by diagnosticians. Therefore, machine learning techniques are widely used to build intelligent fault diagnosis systems and monitor machine health conditions.

Machine learning techniques require careful feature extraction before they can monitor machine health conditions well. Data representations, also known as features, are extracted from sensory data to remove redundant information and retain the relevant information. For example, Samanta and Al-Balushi (2003) extracted statistical features, e.g., root mean square, kurtosis, and skewness, from time-domain sensory data. Moreover, Al-Badour, Sunar, and Cheded (2011) used the wavelet transform to extract features from the time–frequency domain of the original data, and Zhong, Wong, and Yang (2018) used intrinsic mode functions, which are extracted by empirical mode decomposition, as the features for further machine health condition monitoring. However, manual feature extraction requires expert domain knowledge and can be labor-intensive.
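For illustration, the following is a minimal sketch of how such time-domain statistical features might be computed from segments of a vibration signal; the feature set, window length, and variable names are assumptions for this example and not the exact pipelines of the cited works.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def time_domain_features(segment):
    """Compute common time-domain statistics from one vibration segment.

    `segment` is assumed to be a 1-D NumPy array of accelerometer samples;
    the chosen statistics (RMS, kurtosis, skewness, peak, crest factor)
    are illustrative, not the exact feature set of the cited works.
    """
    rms = np.sqrt(np.mean(segment ** 2))    # root mean square (signal energy level)
    kurt = kurtosis(segment)                # impulsiveness, sensitive to bearing faults
    skw = skew(segment)                     # asymmetry of the amplitude distribution
    peak = np.max(np.abs(segment))          # largest absolute amplitude
    crest = peak / rms                      # crest factor: peak relative to RMS
    return np.array([rms, kurt, skw, peak, crest])

# Example: split a long vibration signal into fixed-length windows
# and build a feature matrix (one row per window).
signal = np.random.randn(120_000)           # placeholder for a recorded vibration signal
window = 2048
features = np.stack([time_domain_features(signal[i:i + window])
                     for i in range(0, len(signal) - window + 1, window)])
```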

Representation learning techniques are developed to overcome the weakness of manual feature extraction since they are designed to learn data representations automatically from raw data. For example, the convolutional neural network (CNN) (Krizhevsky, Sutskever, & Hinton, 2012) applies multiple learnable filters to the data points to learn data representations. Ince, Kiranyaz, Eren, Askar, and Gabbouj (2016) applied a 1-D CNN to learn data representations and monitor machine health conditions concurrently. The autoencoder (AE) (Rumelhart, Hinton, & Williams, 1985) learns data representations by reconstructing the original data points from their embedded representations. Hierarchical representations can be obtained by stacking AEs to form a deep structure, the stacked AE (SAE). Lu, Wang, Qin, and Ma (2017) used the SAE and the stacked denoising autoencoder (SDA) to learn data representations from the original vibration data of motors. However, these algorithms do not consider the geometry information between data points, which is an important property of representations.

The graph-based representation learning method is one of the most common methods that take advantage of geometry information while learning representations. Laplacian eigenmaps (LE) (Belkin & Niyogi, 2002) first constructs an affinity matrix based on the Euclidean distances between data points and their neighbors. The affinity matrix is then used to minimize the distances between each data point and its neighbors in the representation space. Li, Liyanaarachchi Lekamalage, Liu, Chen, and Huang (2019) proposed a novel representation learning algorithm named the fast autoencoder with local and global penalties (FAE-LG), which can preserve the local and global geometries of the original data points from vibration data of motors. Although LE and FAE-LG preserve the geometry information in data representations, they require a predefined and fixed affinity matrix built under assumed prior knowledge, e.g., a k-nearest neighbor graph with binary or Gaussian edge weights that represents the geometry relationships between data points. However, the assumed prior knowledge might not precisely represent the real geometry relationships between data points. Also, the potential relationship between the affinity matrix and the classes is not fully exploited since the affinity matrix is constructed independently of the subsequent tasks. Furthermore, the existing deep-learning-based algorithms require an iterative training procedure, which is time-consuming.
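For concreteness, the sketch below builds the kind of predefined, fixed affinity matrix that LE-style methods rely on: a k-nearest-neighbor graph with Gaussian (heat-kernel) edge weights computed once in the original data space. The bandwidth sigma, the neighborhood size k, and the column-wise data layout are assumptions of this illustration.

```python
import numpy as np

def knn_gaussian_affinity(X, k=5, sigma=1.0):
    """Predefined affinity matrix: k-NN graph with Gaussian (heat-kernel) weights.

    X is assumed to have shape (d, N), one column per sample, matching the
    paper's notation; k and sigma are user-chosen prior knowledge that stays
    fixed during representation learning.
    """
    N = X.shape[1]
    # Pairwise squared Euclidean distances between columns.
    sq_norms = np.sum(X ** 2, axis=0)
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X.T @ X
    np.maximum(D, 0.0, out=D)

    S = np.zeros((N, N))
    for i in range(N):
        # Indices of the k nearest neighbors of sample i (excluding itself).
        neighbors = np.argsort(D[i])[1:k + 1]
        S[i, neighbors] = np.exp(-D[i, neighbors] / (2.0 * sigma ** 2))
    # Symmetrize so that S can serve as an undirected graph.
    return np.maximum(S, S.T)
```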

One way to address the weakness of the predefined affinity matrix is to treat it as a variable and learn it during the training process. Nie et al. proposed an adaptive graph learning method in a clustering setting (Nie, Wang, & Huang, 2014). Instead of manually constructing the affinity matrix, the clustering with adaptive neighbors (CAN) (Nie et al., 2014) adaptively adjusts it during the clustering procedure. CAN was then extended to the projected clustering with adaptive neighbors (PCAN) (Nie et al., 2014). PCAN adjusts the affinity matrix based on the linearly projected data points and forces it to be suitable for the clustering task. Furthermore, Liu, Lekamalage, Huang, and Lin (2018) adaptively adjusted the affinity matrix based on the non-linear data embeddings obtained by an ELM-based algorithm. Inspired by the adaptive neighbor techniques used in the above studies, we can learn the affinity matrix and the non-linear data embeddings simultaneously in representation learning methods.
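As an illustration of the adaptive-neighbor idea, the sketch below shows the commonly used closed-form per-row update of the affinity matrix, in which each row is optimized under simplex constraints with a quadratic regularizer chosen so that exactly k neighbors keep a nonzero similarity; in CAN the distances come from the original space, while PCAN and the ELM-based variant compute them from projected or embedded data. Function and variable names are illustrative.

```python
import numpy as np

def adaptive_neighbor_row(dist_i, k):
    """Closed-form update of one row of the affinity matrix (CAN-style sketch).

    dist_i: squared Euclidean distances from sample i to all N samples
            (including itself, whose distance is zero);
    k:      number of neighbors allowed a nonzero similarity.

    Solves  min_{s >= 0, sum(s) = 1}  sum_j dist_i[j]*s[j] + gamma*s[j]**2,
    where gamma is set implicitly so that exactly k entries stay positive.
    """
    N = dist_i.shape[0]
    order = np.argsort(dist_i)            # order[0] is the sample itself
    d_sorted = dist_i[order]
    d_cut = d_sorted[k + 1]               # distance to the (k+1)-th neighbor
    denom = k * d_cut - np.sum(d_sorted[1:k + 1]) + 1e-12
    s = np.zeros(N)
    s[order[1:k + 1]] = (d_cut - d_sorted[1:k + 1]) / denom
    return s

# Example: adaptively recompute the whole affinity matrix from pairwise distances D
# (shape (N, N)); each row is updated independently at every iteration.
# S = np.stack([adaptive_neighbor_row(D[i], k=5) for i in range(D.shape[0])])
```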

The extreme learning machine autoencoder (ELM-AE) (Kasun et al., 2016, Kasun et al., 2013) was proposed to reduce the training time and improve the efficiency of representation learning algorithms. ELM-AE, which learns data representations based on singular values, is an extension of ELM. ELM, introduced by Huang, Huang et al. (2015), Huang et al. (2006) and Huang, Zhu, Siew, et al. (2004), is a single-layer feed-forward neural network (SLFN) with randomly generated and fixed hidden neurons. ELM is well known for its efficient learning procedure and excellent generalization capability. Therefore, various extensions of ELM have been developed (Cao et al., 2016, Deng et al., 2016, Huang, Liu et al., 2015, Mirza and Lin, 2016). As one of the extensions of ELM, ELM-AE utilizes the advantages of random hidden neurons and can embed the original data into another representation space efficiently. Moreover, multiple ELM-AEs can be stacked together to learn hierarchical representations. Cui, Huang, Kasun, Zhang, and Han (2017) stacked multiple ELM-AEs to build an extreme learning machine network (ELMNet) and achieved promising performance on a handwritten dataset. In this study, we utilize the high efficiency of ELM-AE to develop a new representation learning algorithm.
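As a rough sketch of the ELM-AE mechanism referred to above: a random, fixed hidden layer maps the input non-linearly, the output weights β are obtained in closed form by ridge-regularized least squares so that the hidden activations reconstruct the input, and β is then reused to embed the data. The orthogonal random weights and singular-value-based refinements of the original ELM-AE papers are omitted here; function and parameter names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_ae_fit(X, n_hidden=100, lam=1e-3):
    """Fit a basic ELM-AE-style autoencoder on X of shape (d, N).

    The hidden layer is random and fixed; only the output weights beta are
    learned, in closed form, by ridge-regularized least squares so that
    beta.T @ H reconstructs X. Simplified sketch: the original ELM-AE also
    uses orthogonal random weights / a singular-value-based variant.
    """
    d, N = X.shape
    W = rng.standard_normal((n_hidden, d))       # random input weights (fixed)
    b = rng.standard_normal((n_hidden, 1))       # random biases (fixed)
    H = np.tanh(W @ X + b)                       # non-linear hidden activations, (L, N)
    # Closed-form ridge solution of  min_beta ||beta^T H - X||^2 + lam ||beta||^2
    beta = np.linalg.solve(H @ H.T + lam * np.eye(n_hidden), H @ X.T)   # (L, d)
    return W, b, beta

def elm_ae_embed(X, beta):
    """Embed data with the learned output weights, as ELM-AE does."""
    return beta @ X                              # (L, N) representations

# Example usage on random data standing in for sensor features.
X = rng.standard_normal((32, 500))               # 32-dim features, 500 samples
W, b, beta = elm_ae_fit(X, n_hidden=100, lam=1e-3)
Z = elm_ae_embed(X, beta)
```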

In this paper, we propose a novel representation learning algorithm named the locality-preserving extreme learning machine autoencoder with adaptive neighbors (LELMAE-AN). The proposed method consists of two learning parts: affinity matrix learning and representation learning. The affinity matrix learning part learns an affinity matrix whose elements represent the similarities of all pairwise data samples. It penalizes pairs of samples that have a large Euclidean distance in both the original data space and the embedded representation space. Therefore, the similarity values are forced to be large if two data samples are close to each other and small if they are far apart. The representation learning part learns the data representations by minimizing the error between the reconstructed data and the original input data. Since both the affinity matrix learning and the representation learning subproblems are convex, they can be optimized efficiently.
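The full objective and its optimization are given in Section 3. Purely as an illustration of the alternating scheme just described, the sketch below switches between a CAN-style update of S, driven by distances in both the original and embedded spaces, and a closed-form update of the ELM-AE output weights with an added graph-Laplacian locality term. The specific objective terms, the embedding Z = beta @ X, and all hyper-parameter names are assumptions of this sketch; the paper's soft discrimination (label) constraint is omitted, and this is not the authors' implementation.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def pairwise_sq_dists(M):
    """Squared Euclidean distances between the columns of M (shape (dim, N))."""
    sq = np.sum(M ** 2, axis=0)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * M.T @ M, 0.0)

def lelmae_an_sketch(X, n_hidden=100, k=5, lam=1e-3, mu=1e-2, rho=1.0, n_iter=10, seed=0):
    """Illustrative alternating optimization in the spirit of LELMAE-AN.

    Assumptions (not taken from the paper): embeddings Z = beta @ X, an ELM-AE
    reconstruction term with a ridge penalty on beta, a graph-Laplacian locality
    term tr(Z L Z^T), and a CAN-style per-row update of S using distances
    combined over the original and embedded spaces. The paper's soft
    discrimination (label) constraint is omitted.
    """
    rng = np.random.default_rng(seed)
    d, N = X.shape
    W = rng.standard_normal((n_hidden, d))       # random, fixed input weights
    b = rng.standard_normal((n_hidden, 1))       # random, fixed biases
    H = np.tanh(W @ X + b)                       # hidden activations, (n_hidden, N)
    beta = np.linalg.solve(H @ H.T + lam * np.eye(n_hidden), H @ X.T)   # warm start
    D_x = pairwise_sq_dists(X)                   # distances in the original space

    for _ in range(n_iter):
        # --- S-step: adaptive neighbors from combined distances --------------
        Z = beta @ X                             # current embedded representations
        D = D_x + rho * pairwise_sq_dists(Z)
        S = np.zeros((N, N))
        for i in range(N):
            order = np.argsort(D[i])             # order[0] is sample i itself
            d_knn = D[i, order[1:k + 1]]
            d_cut = D[i, order[k + 1]]
            S[i, order[1:k + 1]] = (d_cut - d_knn) / (k * d_cut - d_knn.sum() + 1e-12)
        S = (S + S.T) / 2.0                      # symmetrize
        Lap = np.diag(S.sum(axis=1)) - S         # graph Laplacian of the current S

        # --- beta-step: reconstruction + locality term ------------------------
        # Zeroing the gradient of ||beta^T H - X||^2 + lam*||beta||^2
        # + mu*tr(beta X Lap X^T beta^T) yields a Sylvester equation:
        # (H H^T + lam I) beta + beta (mu X Lap X^T) = H X^T
        beta = solve_sylvester(H @ H.T + lam * np.eye(n_hidden),
                               mu * X @ Lap @ X.T,
                               H @ X.T)
    return beta, S
```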

In summary, the contributions of this paper are as follows.

  • (1)

    We propose a unified objective function that can concurrently learn data representations and the affinity matrix. Moreover, we propose a soft discrimination constraint, which is optimized together with the objective function. The constraint utilizes the label information to obtain the discriminative affinity matrix and representations, and the soft constraint prevents over-fitting compared to the hard constraint used in other affinity matrix learning methods.

  • (2)

In the proposed method, the affinity matrix is learned by utilizing the geometry information of both the original data points and their representations. The assumption is that pairwise data points with a smaller distance in both the original data space and the embedded representation space should have a higher similarity; thus, the geometry structure of the original data can be preserved in the representations. Additionally, the label information forces data points within the same class to have a higher similarity than those in different classes.

  • (3)

The proposed method learns non-linear representations with a fast learning speed and excellent generalization capability. By using ELM-AE, our method can approximate non-linear functions explicitly and obtain non-linear data representations together with the capability to learn the affinity matrix.

  • (4)

    We experimentally demonstrate that jointly learning the affinity matrix and representations can improve the performance of machine fault diagnosis.

The rest of this paper is organized as follows. Section 2 briefly describes related methods for the representation learning and the affinity matrix learning. Section 3 presents the details of the proposed method. The experimental designs and results are discussed in Section 4, and this paper concludes in Section 5. The notations used in this paper are summarized in Table 1.

Section snippets

Preliminaries

In this section, we describe related methods in representation learning and affinity matrix learning. Let the unlabeled training set with $N$ samples be $\mathbf{X}$, where $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N] \in \mathbb{R}^{d \times N}$.

Locality-preserving extreme learning machine autoencoder with adaptive neighbors

In this section, we formulate the objective function of the proposed LELMAE-AN. We then propose an optimization solution for the objective function. There are two learning tasks in the objective function: affinity matrix learning and representation learning. The affinity matrix learning aims to learn an affinity matrix, S, that contains geometry information and label information. The representation learning aims to learn a projection matrix, β, that is used to embed the input data into the representation space.

Experimental results and discussions

We compared the proposed LELMAE-AN with state-of-the-art algorithms on various real-world benchmark datasets. Firstly, the proposed method was tested on six standard machine learning datasets, namely four UCI datasets and two object recognition image datasets, to test its generalization. After that, we also applied the proposed method to the machine fault diagnosis task, i.e., the method was tested on the CWRU bearing vibration dataset. The details of the datasets are described below.

Conclusion

In this paper, a novel representation learning method, named LELMAE-AN, is proposed to improve the performance of machine fault diagnosis tasks. The proposed method exploited the geometry information from both the data space and the representation space by using a tunable affinity matrix. The affinity matrix also obtained discriminative information through a soft discrimination constraint. The input data could be efficiently mapped to non-linear representations by using the ELM-AE based method.

References (34)

Cited by (7)

  • A hybrid framework based on extreme learning machine, discrete wavelet transform, and autoencoder with feature penalty for stock prediction

    2022, Expert Systems with Applications
    Citation Excerpt :

    It is widely used in classification, regression problems (Y. Chen, Xie, Zhang, Bai, & Hou, 2020; Z. Liu, Jin, & Mu, 2020), clustering, feature selection, and representation learning (G. Huang, Song, Gupta, & Wu, 2014). More information about ELM has been extensively provided by other scholars (Alaba et al., 2019; Alade, Selamat, & Sallehuddin, 2017; J. Chen, Zeng, Li, & Huang, 2020; W. Deng, Zheng, & Chen, 2009; Ding, Zhao, Zhang, Xu, & Nie, 2015; Y. Li, Zeng, Liu, Jia, & Huang, 2020; Zeng, Chen, Li, Qing, & Huang, 2020; Zeng, Li, Chen, Jia, & Huang, 2020; J. Zhang, Li, Xiao, & Zhang, 2020b). After selecting an excellent machine learning model suitable for application scenarios, the data need to be processed for model training.

  • Residual wide-kernel deep convolutional auto-encoder for intelligent rotating machinery fault diagnosis with limited samples

    2021, Neural Networks
    Citation Excerpt :

    Therefore, it is necessary to monitor and control the operating conditions of rotating machinery. Moreover, how a fast and effective method can be developed for detecting faults in bearings and gears is still an essential issue in the modern industry (Klausen, Robbersmyr, & Karimi, 2017; Li, Xiang, Wang, Zhou, & Tang, 2018; Li, Zeng, Liu, Jia, & Huang, 2020; Yin, Wang, & Karimi, 2014; Zarei, Tajeddini, & Karimi, 2014). Hence, many kinds of artificial intelligent algorithms were applied in the problem of rotating machinery fault detection (Zhang et al., 2020; Zhao & Lai, 2019; Zhao, Lai and Chen, 2019), including machine learning algorithms (e.g. artificial neural networks Gangsar & Tiwari, 2020; Yu, Junsheng, et al., 2006, support vector machine (SVM) Liu, Bo, & Luo, 2015; Wu, Yin, & Karimi, 2014, k-nearest neighbors Wang, 2016 and logic regression Gangsar & Tiwari, 2020) and deep learning algorithms (e.g. convolutional neural networks (CNN) Jiang, He, Yan, & Xie, 2018, recurrent neural network (RNN) Li et al., 2018; Miao, Li, Sun, & Liu, 2019; Zhao et al., 2017, deep auto-encoders (DAE) Haidong, Hongkai, Ke, Dongdong, & Xingqiu, 2018; Sun et al., 2018, and deep belief network (DBN) Yu & Liu, 2020) depending on their superior properties.

  • Partial transfer learning in machinery cross-domain fault diagnostics using class-weighted adversarial networks

    2020, Neural Networks
    Citation Excerpt :

    Intelligent machinery fault diagnosis has been attracting increasing attention in the past years (Li, Zeng, Liu, Jia, & Huang, 2020; Liu et al., 2020; Luo, Wang, Tang, & Wang, 2019; Yu, Lin, Ma, Li, & Zeng, 2019; Zhao & Lai, 2019; Zhao, Lai, & Chen, 2019), due to the remarkable merits of high accuracy, fast response, easy implementation etc.

  • A physical knowledge-based extreme learning machine approach to fault diagnosis of rolling element bearing from small datasets

    2020, UbiComp/ISWC 2020 Adjunct - Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers