A consistency regularization based semi-supervised learning approach for intelligent fault diagnosis of rolling bearing

doi:10.1016/j.measurement.2020.107987

Measurement

Volume 165, 1 December 2020, 107987

https://doi.org/10.1016/j.measurement.2020.107987

Highlights

• An SSL method for intelligent fault diagnosis of rolling bearings is proposed.
• Data augmentation is the foundation of the proposed method.
• Two consistency terms make the trained network less sensitive to extra perturbations.
• The proposed method achieves excellent performance in the experimental study.

Abstract

Deep learning is now widely used for automated fault diagnosis of rolling bearings. However, most deep learning based bearing fault diagnosis methods assume that the recorded samples are labeled, although most field data are recorded without label information. To address this issue, an effective semi-supervised learning method based on the principle of consistency regularization is proposed in this study. This principle requires the model predictions to be less sensitive to extra perturbations imposed on the input samples. In the proposed method, a data augmentation method serves as the extra perturbation and is imposed on both the labeled and unlabeled samples to enrich the data library. Meanwhile, a label predicting process is formulated to estimate the appreciable label distribution for each unlabeled sample. Correspondingly, two consistency loss terms are introduced to make the model predictions for both labeled and unlabeled samples invariant to the extra perturbation: a supervised loss term enforces the model predictions for augmented labeled samples to be consistent with their true labels, and an unsupervised loss term minimizes the discrepancy between the label distributions for the original unlabeled samples and their appreciable label distributions. Analysis of an experimental bearing fault dataset demonstrates that the proposed method provides excellent identification performance when labeled samples are limited.

Introduction

Rolling bearings are among the most important mechanical components and are widely used in rotating machinery. Because a rolling bearing usually carries a large dynamic load during its service life, a failure can easily develop unnoticed on its surface [1], [2], [3], [4]. Such a failure directly affects the performance of the whole production line and can lead to huge economic losses. It is thus essential to develop bearing fault diagnosis methods that accurately detect a bearing defect at an early stage [5], [6], [7].

Data-driven fault diagnosis methods have become a research trend owing to their high level of automation and fast implementation [8], [9], [10]. The data-driven bearing fault diagnosis methods proposed so far can be broadly classified into two categories: (1) shallow learning based diagnostic methods [11], [12], and (2) deep learning based diagnostic methods [13], [14], [15]. Shallow learning based methods use no hidden layers, or only a couple of them, in the network structure and therefore lack the capacity to handle complicated fault classification cases. Moreover, they require a feature extraction step, which increases the computational complexity and reduces the level of automation [11], [12]. Deep learning based methods, on the other hand, integrate feature extraction and fault classification into a single model, which largely addresses the issues of shallow learning based methods and significantly improves diagnostic accuracy. Various deep architectures have been proposed in the literature for fault diagnosis applications [16], [17], [18], [19], [20], [21]. For example, Kong et al. [16] integrated multiple deep auto-encoder architectures into an ensemble learning framework for bearing fault diagnosis. Jing et al. [17] proposed a convolutional neural network (CNN) that directly extracts discriminative features from the Fourier data to realize automatic gearbox fault diagnosis. Jiao et al. [18] proposed a deep coupled CNN architecture to fuse the feature streams from various information sources. Lei et al. [19] proposed a long short-term memory (LSTM) network that extracts discriminative features from multivariate time series while accounting for long-term dependencies, and effectively achieved fault diagnosis of a wind turbine. An et al. [20] adopted an LSTM network to eliminate the effect of the nonstationary characteristics of the vibration data on learning performance.

Deep learning based diagnostic methods can effectively classify fault conditions from the recorded data samples without manual intervention. However, most existing deep learning based diagnostic methods assume that the recorded samples are labeled. This ignores the fact that most recorded samples carry no label information, because labeling a sample requires considerable manpower and resources in industrial settings. Semi-supervised learning (SSL) methods, which make use of both labeled and unlabeled samples, are a powerful approach to this issue. A series of SSL approaches have been proposed recently to cope with limited labeled samples in industrial settings [22], [23], [24], [25], [26], [27]. For example, Zhang et al. [22] proposed a deep semi-supervised method of multiple association layers networks, built on the ladder network, for fault diagnosis of a planetary gearbox. Jiang et al. [23] performed process monitoring with limited labeled data and sufficient unlabeled data in a two-step manner. Razavi-Far et al. [24] integrated the statistical features from multiple sensory streams into a semi-supervised deep ladder network for gearbox fault diagnosis. A semi-supervised smooth alpha layering algorithm was proposed in Ref. [25] to realize bearing fault identification with limited labeled samples. Li et al. [26] proposed a semi-supervised gear fault diagnosis method that combines an augmented auto-encoder with augmented monitoring data. Liang et al. [27] transformed the one-dimensional monitoring data into two-dimensional time–frequency images and fed them into a semi-supervised generative adversarial network for fault diagnosis of rotating machinery. Although these methods improve the identification ability in the SSL task, their tedious procedures make it difficult to perform fault diagnosis efficiently.

Data augmentation (DA) is an effective approach to enrich the data library of the recorded data and has been widely used in object recognition and image processing [28]. The basic idea of DA is to change part of the structure of the recorded data to generate more variants while the label information of the recorded data remains unchanged [29], [30]. When the augmented data are used to update the parameters of the network, they can be viewed as an extra perturbation imposed on the current network. Because the label information of the augmented data remains unchanged, the trained network is expected to be less sensitive to deformation of the recorded data. This idea is an effective way to improve the generalization ability of the network and coincides with the principle of consistency regularization based SSL methods. In such methods, an extra perturbation is first imposed on the input samples or the hidden states of the network architecture, and a regularization term is then introduced to make the model predictions invariant to that perturbation.

Various SSL techniques have been proposed under this framework, such as virtual adversarial training (VAT) [31], MixMatch [32], unsupervised data augmentation (UDA) [33], ladder network [34] and mean teacher [35]. They differ mainly in how and where the extra perturbation is imposed. In VAT [31], an additive perturbation that maximally changes the output distribution is imposed on the input samples, and the model prediction is enforced to be less sensitive to this perturbation. In MixMatch [32], random horizontal flips and crops are first imposed on the input image data, and MixUp is then used to further mix the labeled and unlabeled samples. Correspondingly, two consistency costs are designed for the labeled and unlabeled data to regularize the model. In UDA [33], a consistency regularization framework is first proposed and the effectiveness of various perturbation strategies is then investigated. In the ladder network [34], two parallel encoders are built for labeled and unlabeled data simultaneously. The encoder for labeled data is corrupted by adding isotropic Gaussian noise to its hidden states, whereas the encoder for unlabeled data is noise free, and a regularization term is formulated to enforce consistency between the hidden states of the two encoders. In mean teacher [35], although various perturbations are imposed on the input samples, the output distributions of the student model and the teacher model are assumed to be consistent. In these studies, most of the extra perturbations were designed for two-dimensional image data and cannot be directly adopted for bearing fault diagnosis, because bearing condition monitoring data are typically one-dimensional.

To address this issue, a new SSL method based on the principle of consistency regularization is proposed in this study for bearing fault diagnosis with limited labeled samples. In the proposed method, a DA method (DAM) designed for one-dimensional bearing fault data imposes the extra perturbation on the original samples, which effectively enriches the data library for both labeled and unlabeled samples. Then, an appreciable label distribution for the unlabeled samples is formulated from the model predictions for the augmented unlabeled samples. After that, two consistency loss terms are formulated to make the model predictions invariant to the extra perturbations. A supervised cross entropy loss enforces the model predictions for augmented labeled samples to be consistent with the true labels of the original labeled samples, and an unsupervised consistency loss minimizes the discrepancy between the label distributions for the original unlabeled samples and their appreciable label distributions. A schematic illustration of the proposed method is given in Fig. 1. The proposed DAM greatly improves the diversity of the original samples, which leads to high identification performance with limited labeled samples. Moreover, the consistency regularization on the unlabeled samples helps the classifier better determine the class membership of each unlabeled sample, which pushes the decision boundary away from the high-density regions of the marginal data distribution. The classification results on an experimental bearing dataset demonstrate that the proposed method provides accurate bearing fault diagnosis with limited labeled samples.
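The two loss terms described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the three perturbations in `augment`, the toy linear-softmax classifier `predict`, the squared-error form of the unsupervised discrepancy and the weight `lam` are all assumptions made for illustration, since this excerpt does not specify the DA strategies, the network or the discrepancy measure.

```python
import numpy as np

def augment(x, rng):
    """Apply one random label-preserving perturbation to a 1-D signal.

    The three perturbations are hypothetical examples, not the paper's DAM.
    """
    choice = rng.integers(3)
    if choice == 0:  # additive Gaussian noise scaled to the signal level
        return x + 0.05 * x.std() * rng.standard_normal(x.shape)
    if choice == 1:  # random amplitude scaling
        return x * rng.uniform(0.8, 1.2)
    return np.roll(x, rng.integers(1, x.size))  # circular time shift

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict(W, x):
    """Toy linear-softmax classifier standing in for the deep network."""
    return softmax(x @ W)

def supervised_loss(p_aug, y):
    """Cross entropy between predictions for augmented labeled samples
    and the true labels of the original samples."""
    return -np.mean(np.log(p_aug[np.arange(y.size), y] + 1e-12))

def unsupervised_loss(p_orig, q_target):
    """Squared-error discrepancy (assumed form) between predictions for
    the original unlabeled samples and their target label distributions."""
    return np.mean(np.sum((p_orig - q_target) ** 2, axis=1))

def total_loss(W, x_ld, y_ld, x_uld, q_uld, rng, lam=1.0):
    """Supervised term on augmented labeled data plus weighted
    unsupervised consistency term on the original unlabeled data."""
    x_ld_aug = np.stack([augment(x, rng) for x in x_ld])
    l_sup = supervised_loss(predict(W, x_ld_aug), y_ld)
    l_unsup = unsupervised_loss(predict(W, x_uld), q_uld)
    return l_sup + lam * l_unsup
```

Here `q_uld` plays the role of the appreciable label distributions, which are treated as fixed targets when the loss is evaluated.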

The main contributions of this study are highlighted as follows:

1) An effective SSL method based on the principle of consistency regularization is proposed for automatic bearing fault diagnosis with limited labeled samples. In the proposed method, a DAM designed for one-dimensional vibration data acts as the extra perturbation imposed on the samples. Correspondingly, two consistency loss terms, one for the labeled samples and one for the unlabeled samples, regularize the model predictions to be invariant to the extra perturbations. The effectiveness and necessity of both consistency terms are validated in the experimental study.
2) Although the labeled samples are limited in the SSL task, the proposed data augmentation strategies applied to the labeled samples greatly improve their diversity. Because the label information of the labeled samples is preserved, a standard cross entropy loss computed on the augmented labeled samples greatly improves the identification performance in the SSL task.
3) The appreciable label information for the unlabeled samples is estimated by averaging and highlighting the one-hot labels of the augmented unlabeled samples. An unsupervised consistency loss is then formulated to minimize the discrepancy between the label distributions for the original unlabeled samples and their appreciable label distributions. The experimental results show that this regularization further improves the identification performance to a certain extent.
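One possible reading of the "average and highlight" step in contribution 3) is sketched below. The excerpt does not define the highlight operation, so the temperature sharpening used here (borrowed from MixMatch) is an assumption, as are the function name `appreciable_label_distribution` and the default `temperature`.

```python
import numpy as np

def appreciable_label_distribution(probs_aug, temperature=0.5):
    """Estimate target label distributions for unlabeled samples.

    probs_aug: array of shape (K, n_uld, M) holding the model predictions
    for K augmented versions of each of n_uld unlabeled samples.
    The sharpening step is an assumed stand-in for the "highlight" operation.
    """
    n_classes = probs_aug.shape[2]
    # One-hot label for each augmented sample (argmax of the prediction).
    one_hot = np.eye(n_classes)[probs_aug.argmax(axis=2)]
    # Average the one-hot labels over the K augmentations.
    avg = one_hot.mean(axis=0)
    # Highlight the dominant class by raising to the power 1/T and renormalizing.
    sharp = avg ** (1.0 / temperature)
    return sharp / sharp.sum(axis=1, keepdims=True)
```

For instance, if three of four augmented versions of a sample are assigned to class 0 and one to class 1, the averaged distribution over three classes is (0.75, 0.25, 0), and sharpening with a temperature of 0.5 gives (0.9, 0.1, 0).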

The rest of this paper is organized as follows. The principle of the proposed method is given in Section 2. An experimental case study is then presented in Section 3 to validate the effectiveness of the proposed method. Finally, conclusions are drawn in Section 4.

Section snippets

The algorithm of the proposed semi-supervised method

The SSL task in the field of mechanical fault diagnosis is investigated in this study. Assume that the labeled dataset is represented as $D_{ld} = \{(x_{i}^{ld}, y_{i}^{ld})\}_{i = 1}^{n_{ld}}$, where $x_{i}^{ld} \in \mathbb{R}^{N}$ is the i-th labeled sample, N is the time length of the input data, $y_{i}^{ld} \in \{1, \dots, m, \dots, M\}$ is the label of the i-th labeled sample, with the value m denoting the m-th health condition, and $n_{ld}$ is the number of labeled samples. Correspondingly, the unlabeled dataset is expressed as $D_{uld} = \{x_{i}^{uld}\}_{i = 1}^{n_{uld}}$, where $x_{i}^{uld} \in \mathbb{R}^{N}$ is the i-th unlabeled sample and $n_{uld}$ is

Experimental setup

The experimental bearing data, which constitute the dataset in this study, are acquired from the bearing test rig shown in Fig. 4. A single-row angular contact ball bearing of type ER-16K (its geometrical parameters are given in Table 3) is used as the test bearing, and five bearing operating conditions, namely healthy (H), inner race fault (IRF), ball fault (BF), outer race fault (ORF) and compound fault (CF), are simulated in the experiment. The simulated

Conclusion

In this study, a new SSL method for intelligent fault diagnosis of rolling bearings is presented based on the principle of consistency regularization. In the proposed method, a DAM consisting of eight DA strategies is proposed to impose extra perturbations on both labeled and unlabeled samples. A label predicting process is then put forward to produce the appreciable label distributions for the unlabeled samples using the model predictions for the augmented unlabeled samples. At the same time, two

CRediT authorship contribution statement

Kun Yu: Conceptualization, Methodology, Writing - original draft. Hui Ma: Supervision. Tian Ran Lin: Validation, Writing - review & editing. Xiang Li: Resources, Supervision, Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This project is supported by the National Natural Science Foundation (Grant Nos. 11972112, 11772089), the Fundamental Research Funds for the Central Universities (Grant Nos. N170308028, N2003014, N180708009 and N180306005) and the LiaoNing Revitalization Talents Program (Grant No. XLYC1807008). This project is also supported by the “Taishan Scholar” program of the Shandong provincial government of the People’s Republic of China. We would like to express our deepest appreciation for the valuable

References (53)

K.K. Chen et al., Calculation of mesh stiffness of spur gears considering complex foundation types and crack propagation path, Mech. Syst. Signal Process. (2019)
H. Ma et al., Time-varying mesh stiffness calculation of cracked spur gears, Eng. Fail. Anal. (2014)
X. Zhao et al., Fault diagnosis of rolling bearing based on feature reduction with global-local margin Fisher analysis, Neurocomputing (2018)
X. Kong et al., A multi-ensemble method based on deep auto-encoders for fault diagnosis of rolling bearings, Measurement (2020)
L. Jing et al., A convolutional neural network based feature learning and fault diagnosis method for the condition monitoring of gearbox, Measurement (2017)
J. Lei et al., Fault diagnosis of wind turbine based on Long Short-term memory networks, Renewable Energy (2019)
Z. An et al., A novel bearing intelligent fault diagnosis framework under time-varying working conditions using recurrent neural network, ISA Trans. (2020)
A. Altan et al., Digital currency forecasting with chaotic meta-heuristic bio-inspired signal processing techniques, Chaos, Solitons Fractals (2019)
K. Zhang et al., Fault diagnosis of planetary gearbox using a novel semi-supervised method of multiple association layers networks, Mech. Syst. Signal Process. (2019)
L. Jiang et al., Semi-supervised fault classification based on dynamic sparse stacked auto-encoders model, Chemom. Intell. Lab. Syst. (2017)
X. Li et al., Semi-supervised gear fault diagnosis using raw vibration signal based on deep learning, Chin. J. Aeronaut. (2020)
Z. Shi et al., A data augmentation method based on cycle-consistent adversarial networks for fluorescence encoded microsphere image analysis, Signal Process. (2019)
R.B. Randall et al., Rolling element bearing diagnostics-A tutorial, Mech. Syst. Signal Process. (2011)
H. Cao et al., Mechanical model development of rolling bearing-rotor systems: A review, Mech. Syst. Signal Process. (2018)
W. Zhang et al., Deep residual learning-based fault diagnosis method for rotating machinery, ISA Trans. (2019)
K. Yu, T.R. Lin, H. Ma, H. Li, J. Zeng, A combined polynomial chirplet transform and synchroextracting technique for...
H.T. Shi et al., Model-based uneven loading condition monitoring of full ceramic ball bearings in starved lubrication, Mech. Syst. Signal Process. (2020)
Z. Pan et al., A two-stage method based on extreme learning machine for predicting the remaining useful life of rolling-element bearings, Mech. Syst. Signal Process. (2020)
Y. Liu et al., Application of weighted contribution rate of nonlinear output frequency response functions to rotor rub-impact, Mech. Syst. Signal Process. (2020)
F. Zhang et al., Mechanism and method for outer raceway defect localization of ball bearings, IEEE Access (2020)
A. Altan et al., Model predictive control of three-axis gimbal system mounted on UAV for real-time target tracking under external disturbances, Mech. Syst. Signal Process. (2020)
R. Razavi-Far et al., Correlation clustering imputation for diagnosing attacks and faults with missing power grid data, IEEE Trans. Smart Grid (2020)
Y. Lei et al., Applications of machine learning to machine fault diagnosis: A review and roadmap, Mech. Syst. Signal Process. (2020)
F. Chen, M. Cheng, B. Tang, W. Xiao, B. Chen and X. Shi, “A novel optimized multi-kernel relevance vector machine with...
X. Zhao et al., Deep laplacian auto-encoder and its application into imbalanced fault diagnosis of rotating machinery, Measurement (2020)
X. Yan, Y. Liu, M. Jia, Health condition identification for rolling bearing using a multi-domain indicator-based...
