Journal of Process Control

Volume 99, March 2021, Pages 107-119

Unsupervised isolation of abnormal process variables using sparse autoencoders

https://doi.org/10.1016/j.jprocont.2021.01.005

Highlights

  • Sparse autoencoders reveal relations between process variables.

  • Abnormal deviations in process variables cause shifts in autoencoder’s residual space.

  • Sparsity enhances residual movements of abnormal process variables.

  • Variables are isolated by backpropagating residual movements through the autoencoder.

Abstract

Isolation of abnormal changes in process variables is an integral component of fault diagnosis, as it provides evidential information for determining the root cause of a detected abnormal event. This task is challenging when the approach to diagnosis does not incorporate knowledge of the process' nominal behavior, but is instead established solely on historical process data. Though isolation of abnormal changes in variables may be facilitated by including historical process data for faults that have been previously diagnosed, results remain inconclusive for unfamiliar faults. This paper presents a method for isolating abnormal changes in process variables with an autoencoder (AE), a type of neural network configured for latent projection, without prior knowledge of nominal process behavior or faults. The AE is optimized with nominal process data as well as a sparsity constraint to produce a sparse network. Probing into the sparse AE allows one to gain insight into the correlations that exist among the process variables during normal process operation. Movements in the AE's reconstruction space are interrogated alongside the acquired knowledge to isolate the abnormal changes in process variables. The method is demonstrated with a simulation of a nonlinear triple tank process, and is shown to isolate abnormal changes in variables for both simple and complex faults.

Introduction

Process operators are regularly confronted with the task of proposing an appropriate cause for an abnormal event. Operators diagnose a fault by isolating abnormal changes in the signals of process variables; a probable cause is then assigned given the identified aggregation of signal changes. Complete reliance on operators for signal evaluation becomes difficult as process plants become larger and more complex. An increasing number of observable process variables leads to information overload, slowing down analysis and risking incorrect diagnosis. Recent advances in data storage technologies have prompted industry to archive historical process data [1]. The growing availability of historical process data, coupled with increasing process complexity, has led to an increase in research on methods for multivariate statistical process monitoring that rely solely on data and not on process knowledge.

Methods for multivariate statistical process monitoring can be grouped into two different approaches, namely, statistical fault classification and feature extraction. Fault classification is the problem of identifying to which of a set of faults a new observation (sample) belongs. However, developing an effective classifier requires an abundant number of training observations for every possible fault. Obtaining a sufficient amount of training data proves difficult when faults, regardless of their severity, rarely occur. Feature extraction is the process of deriving numerical quantities intended to be informative about a data set. It is applied to process monitoring by comparing the features of new observations with the features of a training data set that describes the nominal behavior of a process. A fault is detected when the disparity, usually represented by a monitoring statistic, exceeds a certain threshold. Since historical process data typically contains disproportionately more training samples describing nominal process behavior than faults, monitoring based on feature extraction is generally favored over fault classification.

The focus of this paper is on latent projection (LP), a numerical method for feature extraction. LP reduces the dimension of the process variable space to a set of features that retain information in the original variables. LP uncovers the nominal correlation structure among variables by summarizing correlated variables with a smaller set of principal variables. A model given by LP is used for fault detection by identifying abnormal changes in the correlation structure among process variables.
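As a minimal sketch of linear LP, the idea can be illustrated with PCA on synthetic data; the data and the retained dimension below are hypothetical and serve only to show how correlated variables are summarized by fewer principal variables:

```python
# Minimal latent-projection (LP) sketch via PCA. Illustrative only; the
# paper's nonlinear LP uses an autoencoder instead.
import numpy as np

rng = np.random.default_rng(0)
# 500 samples of 3 correlated process variables (x3 is nearly x1 + x2)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
X = np.column_stack([x1, x2, x1 + x2 + 0.01 * rng.normal(size=500)])
Xc = X - X.mean(axis=0)                 # center the data

# Principal directions come from the SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2                                   # retain two principal variables
T = Xc @ Vt[:k].T                       # scores (latent variables)
Xhat = T @ Vt[:k]                       # reconstruction from latent space

# Because the third variable is (almost) a linear combination of the
# first two, two principal variables suffice and the residuals are small.
print(np.abs(Xc - Xhat).max())
```

The residual space `Xc - Xhat` is where abnormal correlation changes later show up as detectable movements.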

Venkatasubramanian et al. [2] propose that a successful diagnostic system is a hybrid of three diagnostic components: (a) a data-driven method for quick detection; (b) a trend-based method for assessing abnormal changes (shifts) in process variables; and (c) an expert system that proposes a root cause given the result from trend analysis. In the context of LP, much of the available literature addresses the first diagnostic component. Within the class of linear methods, principal component analysis (PCA) and partial least squares have been successfully applied to linear systems where process data follows the assumption of normality [3], [4], [5]. For nonlinear systems where the normality assumption is not met, independent component analysis, kernel PCA, and neural networks demonstrate superior performance [6], [7], [8]. Ku et al. [9] propose dynamic LP, where the process variable vector is extended with past samples to include dynamic behavior in the LP model. An overview of these methods is provided in [10].

Component-wise residual analysis in the form of contribution plots is a well-established approach for assessing abnormal changes in process variables [11], [12]. Contribution plots indicate the contribution of each process variable to the monitoring statistic. If the statistic exceeds its control limit, then the variables exhibiting the largest contributions are investigated. However, LP produces a fault smearing effect such that the signal characteristics of abnormal variables smear onto nominal variables [13]. Identifying a probable cause becomes a challenge since the results from the analysis are ambiguous. Yoon and MacGregor [14] propose a workaround that addresses the fault smearing effect by comparing the normalized contributions with the diagnosed contributions of previous abnormal events. However, the method only applies to abnormal events that have occurred before, and thus fault smearing remains an issue for unfamiliar faults.
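Contribution analysis of this kind can be sketched in a few lines; the residual values below are hypothetical, and the squared-residual contribution used here is the standard SPE decomposition, not necessarily the exact variant used in [11], [12]:

```python
# Sketch of SPE contribution analysis for one abnormal sample.
# The residual vector e = x - x_hat is assumed given (values hypothetical).
import numpy as np

e = np.array([0.1, -2.3, 0.2, 1.1, -0.05])

contributions = e ** 2          # per-variable contribution to the SPE
spe = contributions.sum()       # the monitoring statistic itself

# Variables are ranked by contribution; the top entries are the ones an
# operator would investigate first.
ranked = np.argsort(contributions)[::-1]
print(ranked[:2])               # indices of the two largest contributors
```

Fault smearing means these top-ranked indices are not guaranteed to be the truly faulty variables, which is the ambiguity the paper sets out to resolve.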

Qin and Alcala [5], [13] propose a fault reconstruction-based approach to abnormal trend analysis. Upon detecting an abnormal sample with an LP model, abnormal changes in process variables are isolated by correcting the effect of a fault on the abnormal sample such that the nominal (non-faulty) values are estimated. Since there is no prior knowledge of the detected fault and its fault directions, reconstruction-based methods must carry out a search for the abnormal variables, which becomes computationally expensive for large processes. Combinatorial optimization methods such as the branch and bound algorithm may be integrated into the reconstruction-based method to improve the search efficiency [15].

Results from contribution analysis and reconstruction-based analysis do not provide a root cause to an abnormal event, but rather a list of process variables that are influenced by the abnormal event. A root cause is determined by assessing the abnormal variables with respect to a qualitative understanding of the relationship between process components, process functions, and control architecture [2]; the success of this final diagnostic component is thus dependent on the success of trend analysis at isolating the abnormal shifts in process variables correctly.

Neural networks are known to be potential universal approximators for any nonlinear function [16], [17], [18]. A neural network is configured for nonlinear LP by including a bottleneck network layer that reduces the original variable space to a lower dimension. The network is optimized to (a) learn a nonlinear transformation to the bottleneck layer; and (b) learn a nonlinear transformation that reconstructs the original variables [19]. Such networks, termed autoencoders (AEs), have been proposed for abnormal event diagnosis of nonlinear processes. Fault detection with AEs is shown to outperform other models given by LP methods such as PCA, independent component analysis, and kernel PCA [6], [20], [21], [22], [23], [24], [25]. This result is attributed to the superior nonlinear modeling capacity of AEs. Several methods have been proposed to improve the evaluation of abnormal trends in process variables with an AE. Hallgrímsson et al. [26] augment the optimization of an AE with a sparsity constraint to produce a sparse AE, resulting in a reduction of the fault smearing effect as the contributions from process variables uncorrelated with the faulty variables are eliminated. The effect of sparsity on fault diagnosis is also explored in the works of Yu et al. [27], [28] where process variables affected by a fault are isolated with a sparse discriminant analysis of a sparse AE to offer superior diagnosis over contribution analysis. Sparse AEs have also been used for the detection and localization of anomalies in images; Sabokrou et al. [29] and Touati et al. [30] promote sparsity with a penalty term that encourages network neurons to follow a Bernoulli distribution. Ren et al. [31] propose a reconstruction-based approach with a multilayered AE that learns to estimate a hypothetical faulty variable set with a gradient descent approach.
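The bottleneck structure described above can be sketched as follows. This is a minimal structural illustration with random placeholder weights; in practice the weights are fit to nominal process data, and the paper's network additionally carries a sparsity constraint:

```python
# Structural sketch of an autoencoder (AE) for nonlinear latent
# projection: m input variables are squeezed through an n-dimensional
# bottleneck (n < m) and reconstructed. Weights are random placeholders.
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 2                                 # variables, latent dimension

W_enc = rng.normal(scale=0.5, size=(n, m))  # encoder weights
b_enc = np.zeros(n)
W_dec = rng.normal(scale=0.5, size=(m, n))  # decoder weights
b_dec = np.zeros(m)

def encode(x):
    return np.tanh(W_enc @ x + b_enc)       # nonlinear map to bottleneck

def decode(z):
    return W_dec @ z + b_dec                # map back to variable space

x = rng.normal(size=m)
x_hat = decode(encode(x))
residual = x - x_hat                        # basis of the SPE statistic
print(residual.shape)
```

Training amounts to minimizing the reconstruction error over nominal data, so that large residuals later signal departures from the learned correlation structure.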

This paper proposes a contribution analysis-based method for detecting and isolating abnormal changes in process variables, whilst simultaneously remedying the complications induced by the fault smearing effect. The proposed method does not require prior knowledge of faults. Given a historical process data set sampled from when the process was consistent with normal operating conditions, an AE is optimized with a sparsity constraint to produce a sparse network, permitting one to probe into it to gain insight on the correlation structure among the process variables. The condition of the process is evaluated with a monitoring statistic that is sensitive to an abnormal event that changes the correlation among the process variables. Upon detecting an abnormal event, the contributions to the monitoring statistic are interrogated with the sparse AE to determine the direction of the abnormal shifts in process variables that ultimately explain the contributions. Unlike reconstruction-based approaches, the proposed method does not require an optimization problem to be solved. The proposed method is demonstrated with faults occurring in a simulated nonlinear process. The key result of this study was that interrogating the results from contribution analysis with the sparse AE allows for the isolation of abnormal shifts in process variables.

The organization of this paper is as follows. Section 2 reviews the method of LP in the context of PCA and AEs. Section 3 describes how AEs optimized with a sparsity constraint can expose process variable structure. Process monitoring with AEs and the isolation of abnormal process variables is discussed in Section 4. Section 5 presents the results from diagnosing two different faults occurring in a nonlinear process. The last two sections provide a discussion and conclusion, respectively, of the results.

Section snippets

Latent Projection (LP)

LP is a numerical method that transforms a high-dimensional variable space to a smaller set of latent, principal variables that retain essential information about the original variables. The method sets a compromise between the degree of dimensionality reduction and loss of information. LP has seen increased application in process monitoring as large processes consisting of many process variables can be monitored with a smaller number of principal variables. Let x ∈ ℝ^(m×1) represent a vector of

Discovery of process knowledge

The performance of a neural network model is largely determined by its model complexity. Its prediction accuracy is generally improved by increasing the number and size of network layers. However, proceeding in such a direction has a tendency to overfit the network to training data such that it performs poorly on validation and test data. This is undesirable in multivariate statistical fault diagnosis due to the disparity between the training and test sets; both are sampled from the same
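The sparsity constraint referred to throughout the paper is, per the conclusion, a naive elastic net penalty on the network weights. A sketch of such a penalty term is below; the λ values are hypothetical, and how the term is weighted against the reconstruction loss is not shown here:

```python
# Sketch of a naive elastic net penalty on AE weights: the L1 term
# drives weights to exactly zero (sparsity, exposing variable structure),
# while the L2 term keeps surviving weights small.
import numpy as np

def elastic_net_penalty(W, lam1=1e-3, lam2=1e-3):
    return lam1 * np.abs(W).sum() + lam2 * (W ** 2).sum()

W = np.array([[1.0, -2.0], [0.0, 3.0]])    # toy weight matrix
print(elastic_net_penalty(W))               # 1e-3*6 + 1e-3*14 = 0.02
```

During training the penalty is added to the reconstruction loss, so weights connecting uncorrelated variables are pruned to zero and the surviving nonzero pattern reveals which variables the AE actually relates.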

Online process monitoring and fault contribution analysis

Online process monitoring consists of referring new variable samples against an “in-control” AE trained with historical data collected when only common cause variation was present in the process. New observations are reconstructed by propagating them through the AE to obtain the residuals e_new = x_new − x̂_new. Previously unseen changes in signal characteristics caused by an abnormal event are detected by computing the SPE (otherwise known as the Q monitoring statistic) of the residuals [4]: SPE = ∑_{i=1}^{m} e²_{new,i}
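A minimal sketch of this monitoring loop is below. The residuals are assumed already available from an AE, and the control limit is set empirically as a high percentile of nominal SPE values, which is one common choice but not necessarily the limit derivation used in the paper:

```python
# Online SPE monitoring sketch: compare the SPE of a new sample's
# residuals against a control limit estimated from nominal operation.
import numpy as np

rng = np.random.default_rng(2)

def spe(e):
    return float((e ** 2).sum())

# Nominal residuals (small, common-cause variation only; synthetic here)
nominal_residuals = rng.normal(scale=0.1, size=(1000, 5))
limit = np.percentile([spe(e) for e in nominal_residuals], 99)

# A new sample whose residual on variable 2 has shifted abnormally
e_new = np.array([0.05, 1.5, -0.02, 0.1, 0.0])
print(spe(e_new) > limit)       # fault detected if the SPE exceeds the limit
```

Once the limit is exceeded, the per-variable terms e²_{new,i} are the contributions that the proposed method interrogates with the sparse AE.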

Case study: The triple tank process

The proposed method for diagnosis is demonstrated with a simulated triple tank process (TTP) in this section. The TTP, a multivariate, nonlinear process, is a variant of the quadruple tank process [42]. A schematic drawing of the TTP is given in Fig. 10. The liquid supplying the upper tanks is transported from a large sump by means of two gear pumps. Liquid flows out from the bottom of each tank, with the liquid from the upper right tank first supplying the lower tank before returning to
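As a rough illustration of the nonlinear dynamics in such tank processes, a single tank with Torricelli outflow can be simulated with a forward Euler step. All parameter values below are hypothetical and are not the paper's TTP parameters:

```python
# Single-tank dynamics with Torricelli outflow (source of the TTP's
# nonlinearity): A * dh/dt = q_in - a * sqrt(2 * g * h).
# Parameters are illustrative, not the paper's.
import math

A = 0.03        # tank cross-section [m^2]
a = 2e-4        # outlet cross-section [m^2]
g = 9.81        # gravitational acceleration [m/s^2]
q_in = 1e-4     # inflow from the pump [m^3/s]

h, dt = 0.10, 0.1                       # initial level [m], Euler step [s]
for _ in range(5000):                   # simulate 500 s
    h += dt * (q_in - a * math.sqrt(2 * g * h)) / A
    h = max(h, 0.0)                     # level cannot go negative

# At steady state the inflow balances the outflow: q_in = a*sqrt(2*g*h)
h_ss = (q_in / a) ** 2 / (2 * g)
print(abs(h - h_ss) < 1e-3)             # level has settled near h_ss
```

The square-root outflow is what makes the level dynamics nonlinear, which is why the linear-normality assumptions behind PCA-style monitoring are violated and a nonlinear LP model such as an AE is used.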

Discussion

Since the LP model in this paper is a static sparse AE, diagnosis performs poorly when process variable samples contain temporal information. Reference changes (which induce dynamic process behavior) cause the SPE in Fig. 14 to exceed its control limit, generating false alarms and hampering diagnosis. The process must reach steady-state to confirm that a false alarm has occurred, visualized by the SPE returning to below the control limit. This behavior is explained by reference to Fig. 20.

Conclusion

This paper proposes a method based on sparse autoencoders (AEs) for detecting and isolating abnormal shifts in process variables. The proposed method does not require prior knowledge of faults. In the proposed method, an AE is optimized to reduce the dimensions of a process variable space with historical process data sampled from a process that was consistent with normal operating conditions. The AE's optimization is augmented with naïve elastic net regularization to shrink

CRediT authorship contribution statement

Ásgeir Daniel Hallgrímsson: Conceptualization, Methodology, Software, Validation, Formal analysis, Resources, Data curation, Writing - original draft, Visualization. Hans Henrik Niemann: Conceptualization, Writing - review & editing, Supervision. Morten Lind: Conceptualization, Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to acknowledge the support of the Danish Hydrocarbon Research and Technology Center (DHRTC) at the Technical University of Denmark.
