Dynamic statistical process monitoring based on generalized canonical variate analysis

doi:10.1016/j.jtice.2020.07.007

Journal of the Taiwan Institute of Chemical Engineers

Volume 112, July 2020, Pages 78-86

https://doi.org/10.1016/j.jtice.2020.07.007

Highlights

• A novel generalized canonical variate analysis (GCVA) algorithm is formulated.
• The GCVA can explicitly extract dynamic and static latent variables from time-serial data.
• Comparisons have demonstrated the superiority and effectiveness of the GCVA-based method.

Abstract

A novel generalized canonical variate analysis (GCVA) algorithm is formulated and then applied to data-driven dynamic process monitoring. The proposed GCVA algorithm seeks different projecting bases for the time-serial samples so that the sum of squared canonical correlation coefficients between all pairs of the projected latent variables is maximized. The corresponding dynamic process monitoring scheme first uses GCVA to explicitly and simultaneously extract dynamic and static latent variables from the time-serial data. Second, a multivariate regression model is employed to describe the time-serial relationship between the dynamic latent variables; the model residual then serves as an indicator of inconsistency with the defined time-serial mechanism. For online monitoring, two combined monitoring indices are proposed for detecting abnormalities in the time-dependent and time-independent variations, respectively. Reconstruction-based contribution indices are also derived for fault diagnosis. Finally, comparisons on two dynamic industrial processes demonstrate the capability of the GCVA algorithm to exploit the time-serial correlation inherent in the given data, and validate the effectiveness and superiority of the proposed GCVA-based approach over its counterparts.

Introduction

The importance of ensuring healthy operation of industrial processes continues to drive the need for efficient monitoring systems that provide trustworthy fault detection and diagnosis. With the growing complexity of modern plants and the wider application of computer-aided devices in them, the availability of massive process data has sustained the popularity of data-driven process monitoring approaches for decades [1], [2], [3]. Generally, the essence of data-driven process monitoring is the development of a model that characterizes the normal signature of process data sampled under normal operating conditions. Faults are then defined as deviations from this normality that exceed a threshold. Accordingly, many multivariate analytical algorithms, such as principal component analysis (PCA), can be applied to process monitoring [4], [5], [6], [7]. Different analytical algorithms exploit the information carried by different latent variables for fault detection as well as fault diagnosis.

Given that the measurements in modern industrial plants can be highly time dependent, the time-serial correlation (or auto-correlation) inherent in the given data must be taken into account. To tackle this dynamic process monitoring issue, Ku et al. [8] pioneered the strategy of augmenting each sample with a number of previously measured samples before the PCA algorithm is performed, which results in a dynamic PCA (DPCA) model for dynamic process monitoring. Using the same augmenting strategy, different dynamic process monitoring methods involving different analytical algorithms have been proposed in the literature [9, 10]. Canonical variate analysis (CVA), also called canonical correlation analysis elsewhere, provides an alternative for modeling time-serial data [11], [12], [13]. The CVA algorithm represents the time-serial correlation by constructing state variables from the past samples to explain the future variability. Moreover, Choi et al. [14] and Kerkhof et al. [15] investigated the feasibility of the multivariate autoregressive (AR) model for modeling the time-serial correlation in batch processes.

Furthermore, Miao et al. [16] proposed a dynamic process monitoring approach based on the time neighborhood preserving embedding (TNPE) model. The TNPE reconstructs each sample from its time-serial neighbors instead of its distance neighbors; considering the time-serial relationship of a data manifold during dimensionality reduction can also uncover dynamic latent variables for dynamic process monitoring. Recently, Li et al. [17] developed a dynamic latent variable (DLV) model by maximizing the variance of a weighted sum of lagged latent variables. The resulting DLV model extracts auto-correlated latent variables and statistically time-independent latent variables sequentially. Similarly, Dong and Qin [18] formulated a dynamic-inner PCA (DiPCA) algorithm for extracting dynamic latent variables with maximal auto-covariance.

Other types of dynamic process monitoring approaches are also available in the literature. For example, identifying state-space models has been found to be effective for modeling and monitoring dynamic processes [19,20]. With the use of kernel functions, the aforementioned methods can be extended to handle nonlinearity in the given data [21,22]. Once a fault has been detected, the task of fault diagnosis is activated. An examination of the existing literature on fault diagnosis shows that contribution plots are typically employed to isolate the source variables associated with the fault. An alternative to the classic contribution plots, referred to as reconstruction-based contribution (RBC) plots, was proposed by Alcala and Qin [23]. The RBC calculates the contribution of each monitored variable in concert with the monitoring statistics, and can thus provide more accurate fault diagnosis results than the classic contribution plots [23].
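To make the RBC idea concrete, the following is a minimal sketch for a generic quadratic monitoring index of the form $x^{T} M x$, using the standard RBC expression $(\xi_{i}^{T} M x)^{2} / (\xi_{i}^{T} M \xi_{i})$ with $\xi_{i}$ the $i$-th column of the identity matrix. The function name `rbc` and the matrix `M` are illustrative assumptions; the specific indices derived for GCVA are not shown in this excerpt.

```python
import numpy as np

def rbc(x, M):
    """Reconstruction-based contribution of each variable to x^T M x.

    x : (m,) sample, already scaled like the training data
    M : (m, m) symmetric positive semidefinite matrix with a positive
        diagonal (an assumption of this sketch)
    """
    x = np.asarray(x, dtype=float)
    M = np.asarray(M, dtype=float)
    Mx = M @ x
    # xi_i^T M x is the i-th entry of M x; xi_i^T M xi_i is M[i, i].
    return Mx ** 2 / np.diag(M)
```

The variable with the largest value is the first candidate to inspect; with `M` equal to the identity, each contribution reduces to the squared entry of `x`.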

Generally, the extraction of latent variables representing the time-serial correlation inherent in the given data should be oriented towards maximal canonical correlation. In contrast to the canonical correlation coefficients considered in the CVA algorithm, maximal variance or auto-covariance can reflect the auto-correlated variation only partially, since collinear and/or systematic variation can also dominate a latent variable that satisfies the maximal variance or auto-covariance criterion. From this viewpoint, maximizing the canonical correlation coefficients between every pair of latent variables is the appropriate way to extract auto-correlated features. Motivated by this recognition, a generalized CVA (GCVA) algorithm is proposed for time-serial data modeling and dynamic process monitoring. The GCVA seeks different projecting bases for the time-serial samples so that the sum of squared canonical correlation coefficients over all possible pairs of latent variables is maximized.

The proposed GCVA-based dynamic process monitoring scheme first extracts dynamic latent variables that dominate the time-serial correlated variation, as well as static latent variables that are time-independent. A multivariate regression model is then employed to describe the time-serial relationship of the dynamic latent variables; the corresponding model residual can serve as an indicator of inconsistency with the defined time-serial mechanism. Moreover, the static latent variables representing the time-independent variation are monitored as well. The proposed GCVA-based approach therefore provides an explicit decomposition of the time-serial process data and uncovers two different types of variation inherent in the given data, i.e., time-dependent and time-independent variations, for process monitoring purposes.
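The regression step can be illustrated with a short sketch. It assumes, purely for illustration, that the current dynamic latent variables are regressed on their own $d$ previous values by ordinary least squares; the exact regression structure used with GCVA is not given in this excerpt, and the name `fit_lagged_regression` is hypothetical.

```python
import numpy as np

def fit_lagged_regression(S, d):
    """Least-squares fit of S[t] on [S[t-1], ..., S[t-d]].

    S has shape (N, k): N time steps of k dynamic latent variables.
    Returns the coefficient matrix (k*d, k) and the training residuals
    (N-d, k). Assumes 1 <= d < N.
    """
    S = np.asarray(S, dtype=float)
    N = S.shape[0]
    # Column block j holds the latent variables delayed by j steps.
    past = np.hstack([S[d - j : N - j] for j in range(1, d + 1)])
    current = S[d:]
    coef, *_ = np.linalg.lstsq(past, current, rcond=None)
    residuals = current - past @ coef
    return coef, residuals
```

Under normal operation the residuals should stay small; a residual that grows beyond its normal range suggests that the time-serial relationship learned from training data no longer holds.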

Section snippets

DPCA and CVA

Through augmenting each sample in the training dataset $X = [x_{1}, x_{2}, \dots, x_{n}]^{T} \in R^{n \times m}$ with its previous $d$ measured samples according to the following: $\tilde{X} = \begin{bmatrix} x_{d+1} & x_{d+2} & \dots & x_{n} \\ x_{d} & x_{d+1} & \dots & x_{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1} & x_{2} & \dots & x_{n-d} \end{bmatrix}^{T} \in R^{(n-d) \times m(d+1)}$ the auto-correlation in the dataset $X$ is then mixed with the cross-correlation, where $x_{i} \in R^{m \times 1}$ is the $i$-th sample with $i = 1, 2, \dots, n$, and $n$ and $m$ are the numbers of samples and measured variables, respectively. The standard PCA algorithm can then be performed on the augmented matrix $\tilde{X}$
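The augmentation above can be sketched in a few lines of NumPy. The function name `augment_with_lags` is an illustrative choice, not taken from the paper.

```python
import numpy as np

def augment_with_lags(X, d):
    """Stack each sample with its d previous samples.

    X has shape (n, m); the result has shape (n - d, m * (d + 1)).
    Row k holds [x_{d+1+k}, x_{d+k}, ..., x_{1+k}] (1-based sample indices).
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    if not 0 <= d < n:
        raise ValueError("d must satisfy 0 <= d < n")
    # Block j contains the samples delayed by j steps.
    blocks = [X[d - j : n - j] for j in range(d + 1)]
    return np.hstack(blocks)
```

For example, with five samples of two variables and `d = 2`, the result has three rows and six columns, and the first row is the third sample followed by the second and the first.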

GCVA-based dynamic process monitoring

The GCVA algorithm is proposed to uncover the time-serial correlation by projecting the time-serial samples (i.e., $x_{t}, x_{t-1}, \dots, x_{t-d}$) onto the corresponding projecting bases (i.e., $W_{d+1}, W_{d}, \dots, W_{1}$). The formulation of the GCVA algorithm is conceptually displayed in Fig. 1.
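As a rough illustration of the criterion, the sketch below only evaluates the sum of squared correlations between all pairs of projected scores for given projection vectors, with one latent variable per lag. It is not the GCVA solver: the constraints, the handling of multiple latent variables, and the optimization procedure are not shown in this excerpt, and the function name is hypothetical.

```python
import numpy as np
from itertools import combinations

def sum_squared_pairwise_correlation(blocks, weights):
    """Sum of squared correlations between all pairs of projected scores.

    blocks  : list of (N, m) arrays, e.g. the lagged copies of X
    weights : list of (m,) projection vectors, one per block
    """
    scores = []
    for Xb, w in zip(blocks, weights):
        Xb = np.asarray(Xb, dtype=float)
        Xc = Xb - Xb.mean(axis=0)  # center each block
        scores.append(Xc @ np.asarray(w, dtype=float))
    total = 0.0
    for a, b in combinations(scores, 2):
        r = np.corrcoef(a, b)[0, 1]
        total += r ** 2
    return total
```

With $d + 1$ blocks there are $(d+1)d/2$ pairs, so the value is bounded above by that number; the bound is reached only when all projected scores are perfectly correlated.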

A numerical dynamic system

The monitoring task of the following discrete dynamic system is first considered: $z(t) = Az(t-1) + Bu(t-1) + e(t)$ $y(t) = Cz(t) + \begin{bmatrix} 0.320 & -0.749 \\ -0.263 & 0.689 \\ -0.320 & 0.285 \\ 0.389 & 0 \\ 0 & -0.543 \end{bmatrix} u(t) + v(t)$ where $z(t) \in R^{3 \times 1}$ denotes the state vector at the $t$-th sampling step, and $e(t)$ and $v(t)$ are Gaussian random noises with zero mean and standard deviations of 0.2 and 0.5, respectively. The input vector $u(t) \in R^{2 \times 1}$ is generated by: $u(t) = Du(t-1) + \begin{bmatrix} 0.193 & 0.689 \\ -0.320 & -0.749 \end{bmatrix} w(t)$ where $w(t) \in R^{2 \times 1}$ is a
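A simulation of this system could be sketched as follows. The matrices $A$, $B$, $C$, $D$ and the definition of $w(t)$ are not given in this excerpt, so the sketch takes them as arguments rather than guessing values. The names `G_Y`, `G_W` and `simulate`, and the zero initial conditions, are assumptions made for illustration.

```python
import numpy as np

# Matrices printed in the text; the names G_Y and G_W are made up here.
G_Y = np.array([[0.320, -0.749],
                [-0.263, 0.689],
                [-0.320, 0.285],
                [0.389, 0.0],
                [0.0, -0.543]])
G_W = np.array([[0.193, 0.689],
                [-0.320, -0.749]])

def simulate(A, B, C, D, w, seed=0):
    """Simulate y(t) for len(w) steps from zero initial conditions.

    A (3x3), B (3x2), C (5x3), D (2x2) and the driving sequence w (T x 2)
    are NOT given in this excerpt and must be supplied by the caller.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    T = len(w)
    z = np.zeros(3)  # z(t-1), assumed zero at the start
    u = np.zeros(2)  # u(t-1), assumed zero at the start
    Y = np.empty((T, 5))
    for t in range(T):
        u_new = D @ u + G_W @ w[t]                        # u(t)
        z = A @ z + B @ u + rng.normal(0.0, 0.2, size=3)  # uses u(t-1)
        u = u_new
        Y[t] = C @ z + G_Y @ u + rng.normal(0.0, 0.5, size=5)
    return Y
```

Any matrices passed in while experimenting are placeholders; reproducing the reported results requires the values from the full paper.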

Conclusion

A novel GCVA algorithm with application to fault detection and diagnosis in dynamic processes has been presented. The GCVA is formulated to maximize the sum of squared canonical correlations between every pair of latent variables projected from the time-serial samples. The capability of exploiting the time-serial correlation inherent in the given data has been demonstrated, and the superiority and effectiveness of the proposed GCVA-based dynamic process monitoring scheme over other

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was sponsored by the National Natural Science Foundation of China (61773225), the Natural Science Foundation of Zhejiang Province (LY20F030004), the K.C. Wong Magna Fund in Ningbo University, and the Fundamental Research Funds for the Central Universities under Grant 222201817006.

References (32)

K. Severson et al.
Perspectives on process monitoring of industrial systems
Annu Rev Control
(2016)
Y. Zheng et al.
Parallel projection to latent structures for quality-relevant process monitoring
J Taiwan Inst Chem Eng
(2017)
T. Lan et al.
Statistical monitoring for non-Gaussian processes based on MICA-KDR method
ISA Trans
(2019)
L. Zhou et al.
Multiple probability principal component analysis for process monitoring with multi-rate measurements
J Taiwan Inst Chem Eng
(2019)
L. Guo et al.
A multi-feature extraction technique based on principal component analysis for nonlinear dynamic process monitoring
J Process Control
(2020)
T. Lan et al.
KPI relevant and irrelevant fault monitoring with neighborhood component analysis and two-level PLS
J Frankl Inst
(2018)
W. Ku et al.
Disturbance detection and isolation by dynamic principal component analysis
Chemom Intell Lab Syst
(1995)
J. Zhu et al.
Monitoring big process data of industrial plants with multiple operating modes based on Hadoop
J Taiwan Inst Chem Eng
(2018)
B. Jiang et al.
Fault detection of process correlation structure using canonical variate analysis-based correlation features
J Process Control
(2017)
X. Fan et al.
Direct calibration transfer to principal components via canonical correlation analysis
Chemom Intell Lab Syst
(2018)

Y. Dong et al.
A novel dynamic PCA algorithm for dynamic data modeling and process monitoring
J Process Control
(2018)
G. Li et al.
Kernel dynamic latent variable model for process monitoring with application to hot strip mill process
Chemom Intell Lab Syst
(2017)
C. Tong et al.
Statistical process monitoring based on nonlocal and multiple neighborhoods preserving embedding model
J Process Control
(2018)
G. Li et al.
Comparative study on monitoring schemes for non-Gaussian distributed processes
J Process Control
(2018)
K. Ghosh et al.
Evaluation of decision fusion strategies for effective collaboration among heterogeneous fault diagnostic methods
Comput Chem Eng
(2011)
J. Yang et al.
Concurrent monitoring of global-local performance indicators for large-scale process
J Taiwan Inst Chem Eng
(2019)

