Key-performance-indicator-related state monitoring based on kernel canonical correlation analysis

https://doi.org/10.1016/j.conengprac.2020.104692

Abstract

As a multivariate statistical analysis method, canonical correlation analysis (CCA) performs well for state monitoring of linear processes, but most industrial processes are nonlinear. Kernel canonical correlation analysis (KCCA) has been adopted to address this; however, KCCA cannot by itself distinguish whether a fault is related to key performance indicators (KPIs). In this paper, two improved KCCA methods are proposed to deal with this KPI-related issue. The first performs singular value decomposition (SVD) on the correlation coefficient matrix, so that the kernel matrix can be divided into KPI-related and KPI-unrelated parts. The second performs generalized singular value decomposition (GSVD) on two coefficient matrices. In addition, fault detectability analysis and computational complexity analysis are carried out for both methods. Finally, the Tennessee Eastman (TE) process is used to verify the efficacy of the two proposed methods.

Introduction

As modern industrial processes become more complex, the safety and quality of industrial products attract increasing attention. Because a process may run improperly or even be shut down due to faults or failures, it is necessary to design a fault detection system, which belongs to the first layer of functional process safety.

Multivariate statistical process monitoring (MSPM) (Ge et al., 2013, Li et al., 2019, Qin, 2012, Wang et al., 2018, Zhang and Zhang, 2010) now performs well in fault detection for industrial processes. Principal component analysis (PCA) (Fan et al., 2014, Lou et al., 2018), partial least squares (PLS) (Wang and Yin, 2015, Yin et al., 2015, Zhang et al., 2015) and canonical correlation analysis (CCA) are common MSPM methods. MSPM methods decompose the process variables into principal and residual parts, and the Hotelling $T^2$ statistic or the squared prediction error (SPE) statistic (Joe Qin, 2003) is used for process monitoring. CCA, first proposed by Hotelling (Hotelling, 1936), maximizes the correlation between two sets of variables. Recently, CCA-based process monitoring has received increasing attention and has performed well. Chen and co-workers have proposed several improved CCA-based methods (Chen, Ding et al., 2016, Chen, Zhang et al., 2016) for static and dynamic fault detection, and they also incorporate randomized algorithms to deal with non-Gaussian problems.

Most MSPM methods are designed for linear processes; however, the large amount of data generated by nonlinear processes in modern industry cannot be ignored. For nonlinear problems, the kernel function method is adopted, which maps the data from a low-dimensional space to a high-dimensional feature space. Similarly to kernel principal component analysis (KPCA) and kernel partial least squares (KPLS) (Peng et al., 2013, Yi et al., 2017), the kernel canonical correlation analysis (KCCA) method has been proposed. At present, the KCCA-based method (Liu, Liu, Zhao & Xie, 2018) is used for industrial process state monitoring and has achieved good results.
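To make the kernel idea concrete, the following is a minimal sketch of how a Gaussian kernel matrix can be built from process samples and centered in feature space, as is commonly done before kernel-based monitoring. The function names, the `width` parameter, and the random data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def gaussian_kernel_matrix(X, width):
    """Return K with K[i, j] = exp(-||x_i - x_j||^2 / width).

    X has one sample per row; `width` is a hypothetical kernel parameter.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    return np.exp(-sq_dists / width)

def center_kernel_matrix(K):
    """Center K in feature space: K_bar = (I - 11^T/n) K (I - 11^T/n)."""
    n = K.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    return centering @ K @ centering

# Illustrative data only: 5 samples of 3 process variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
K = gaussian_kernel_matrix(X, width=10.0)
K_bar = center_kernel_matrix(K)
```

The kernel matrix is computed without ever forming the high-dimensional feature vectors explicitly, which is what makes the mapping practical.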

Since industrial processes are becoming more and more complex, it is necessary to ensure process safety, stable product quality, and satisfactory production efficiency. Rather than simply monitoring fluctuations and anomalies of process variables, enterprise managers and engineers pay more attention to whether faults affect the final product quality; hence state monitoring related to key performance indicators (KPIs) is very important. KPIs include product quality, operation cost, maintenance cost, production rate, etc. The main idea of KPI-related state monitoring is to build a relationship between process variables and KPI variables, divide the process variable space into two subspaces (KPI-related and KPI-unrelated), and then monitor the two parts separately.

At present, most KPI-related state monitoring methods are proposed for linear processes, such as PCA-based methods (Sun & Hou, 2017), PLS-based methods (He, Wang, & Liu, 2018) and CCA-based methods (Zhu, Liu, & Qin, 2017). Based on slow feature analysis (SFA), Zheng and Zhao (2019) proposed a quality-related method for full decomposition of process variables in conjunction with CCA. Subsequently, Qin and Zhao (2019) proposed a similar quality-relevant method that considers closed-loop control. These methods perform well for state monitoring related to KPIs. However, nonlinearity is common in industrial processes, and the performance of existing linear KPI-related methods degrades in its presence. For nonlinear processes, Peng et al. (2013) proposed a total KPLS (TKPLS) method; however, TKPLS has too many statistics and complex detection logic, and it does not consider the KPI information in the residual subspace. Then the modified kernel partial least squares (MKPLS) method (Jiao, Zhao, Wang, & Yin, 2017) was adopted; MKPLS divides the feature space into two orthogonal parts, KPI-related and KPI-unrelated, by performing singular value decomposition (SVD) on the correlation matrix. Later, Si, Wang, and Zhou (2021) proposed a method called KPI-KPLS, which divides the feature space by generalized singular value decomposition (GSVD). However, KPLS has the following defects: (1) KPLS uses only the partial correlation information of some selected latent variables to maximize the covariance between KPIs and process variables, which may result in the loss of some relevant information (Wang, Jiao, & Yin, 2017); (2) KPLS may extract latent variables (LVs) of the process variables that have large variations but are not necessarily highly correlated with KPIs (Liu, Zhu, Qin & Chai, 2018); (3) KPLS has a heavy computational burden due to its iterative algorithm, and selecting the latent variables is an issue.

KCCA maximizes the correlation between KPIs and process variables, so that the extracted process information is highly correlated with KPIs. Compared with KPLS, the computational complexity of KCCA is much lower. Motivated by these advantages, two improved KCCA methods are proposed for KPI-related state monitoring.

In this paper, two modified KCCA methods are proposed for nonlinear KPI-related state monitoring. For convenience, one method is called MKCCA and the other KPI-KCCA. MKCCA performs SVD on the correlation coefficient matrix, whereas KPI-KCCA performs GSVD on the matrices obtained by decomposing the correlation coefficient matrix. In both cases the feature space is divided into KPI-related and KPI-unrelated parts, which are monitored by the Hotelling $T^2$ statistic and the SPE statistic. The effectiveness of the two proposed methods for state monitoring is verified on the Tennessee Eastman process.

The contributions of this paper are as follows:

(1) Two enhanced and effective KCCA methods are proposed for KPI-related state monitoring of nonlinear processes.

(2) The fault detectability of the two proposed methods is analyzed and a sufficient condition is proved.

Section snippets

Kernel canonical correlation analysis

KCCA generally has two steps: offline modeling, which builds the relationship between process data and quality data, and online monitoring, which calculates the statistical indices online.

Modified KCCA

According to (12), the expected output can be obtained as

$$\hat{y} = \bar{k}_{new} A \Lambda_k^{T} B^{-1} = \bar{k}_{new} M \tag{15}$$

where $M = A \Lambda_k^{T} B^{-1}$. Note that (15) gives the relation between the expected output and the input, so $M \in \mathbb{R}^{n \times l}$ is the correlation matrix between the expected output and the input, which can be obtained by least squares (LS). KCCA cannot distinguish whether a fault is related to KPIs or not, so operators cannot give priority to KPI-related faults. To address this issue, the MKCCA method is adopted. MKCCA
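The snippet above is cut off before the MKCCA details, so the following is only a hedged sketch of the general idea stated in the abstract: an SVD of the correlation matrix is used to split the kernel matrix into a KPI-related and a KPI-unrelated part. It assumes, by analogy with the MKPLS approach described in the introduction, that the split is an orthogonal projection onto the column space of the correlation matrix and onto its complement. The function name, the tolerance `tol`, and the random test matrices are hypothetical and do not come from the paper.

```python
import numpy as np

def split_kernel_by_svd(K_bar, M, tol=1e-10):
    """Split K_bar into parts inside and outside the column space of M.

    Assumption (not confirmed by the source): the KPI-related part is
    K_bar @ Pi and the KPI-unrelated part is K_bar @ (I - Pi), where Pi is
    the orthogonal projector onto range(M).
    """
    U, s, _ = np.linalg.svd(M, full_matrices=True)
    # Numerical rank of M: count singular values above a relative tolerance.
    rank = int(np.sum(s > tol * s.max())) if s.size else 0
    basis = U[:, :rank]                  # orthonormal basis of range(M)
    Pi_related = basis @ basis.T         # projector onto range(M)
    Pi_unrelated = np.eye(M.shape[0]) - Pi_related
    return K_bar @ Pi_related, K_bar @ Pi_unrelated

# Illustrative matrices only: n = 6 training samples, l = 2 KPI variables.
rng = np.random.default_rng(1)
S = rng.normal(size=(6, 6))
K_bar = S @ S.T                          # stand-in for a centered kernel matrix
M = rng.normal(size=(6, 2))              # stand-in for the correlation matrix
K_related, K_unrelated = split_kernel_by_svd(K_bar, M)
```

Under this assumption the two parts sum to the original kernel matrix, and the KPI-unrelated part contributes nothing to the predicted output, since `K_unrelated @ M` is zero up to round-off.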

Key performance indicator-kernel canonical correlation analysis

This section introduces another method to divide $\bar{K}$ into KPI-related and KPI-unrelated parts. The method performs GSVD on matrices obtained from the correlation matrix $M$. Then (15) can be rewritten as

$$\hat{y} = \bar{k}_{new} M = \bar{k}_{new} A \left(B \Lambda_k^{-T}\right)^{-1} = \bar{k}_{new} A H^{-1}$$

where $H = B \Lambda_k^{-T} \in \mathbb{R}^{l \times l}$. Note that the matrices $A \in \mathbb{R}^{n \times l}$ and $H$ have the same column dimension, so that $A^{T}A$ and $H^{T}H$ have the same dimensions; the following equations can therefore be obtained:

$$A = U_A^{T} \Sigma_A W^{T}, \quad H = U_H^{T} \Sigma_H W^{T}$$

$$[U_H, U_A, W, \Sigma_H, \Sigma_A] = \mathrm{gsvd}(H, A)$$

where GSVD indicates

Fault detectability analysis

The following sensor fault model (Dunia, Qin, Edgar, & McAvoy, 1996) is used in this study:

$$x = x^{*} + f_i \xi_i$$

where $x$ is the faulty sample, $x^{*}$ is the normal sample, $f_i$ is the fault magnitude, and $\xi_i$ is the scale factor vector, which represents how the fault $f_i$ is contained in each variable.
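As a small illustration of the additive sensor fault model, the sketch below adds a fault of a given magnitude along a given direction vector to a normal sample. The function name and the numerical values are hypothetical and chosen only for demonstration.

```python
import numpy as np

def inject_sensor_fault(x_normal, magnitude, direction):
    """Return the faulty sample: x_normal + magnitude * direction."""
    x_normal = np.asarray(x_normal, dtype=float)
    direction = np.asarray(direction, dtype=float)
    if x_normal.shape != direction.shape:
        raise ValueError("sample and fault direction must have the same shape")
    return x_normal + magnitude * direction

# Illustrative values: a bias of 0.5 on the second of three sensors.
x_normal = [1.0, 2.0, 3.0]
xi = [0.0, 1.0, 0.0]
x_faulty = inject_sensor_fault(x_normal, magnitude=0.5, direction=xi)
```

Here only the second variable changes, because the direction vector is zero for the other two sensors.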

In this paper, a Gaussian kernel function is introduced in fault detection. Assuming that the observation is influenced by a fault along direction $\xi_i$, one can obtain Kallas, Mourot, Anani, Ragot, and Maquin

Simulation test

In this study, the KPI-related and KPI-unrelated subspaces are monitored by the $T^2$ and SPE statistics, respectively. The performance of the two proposed methods for state monitoring is reflected by the fault detection rate (FDR) and the false alarm rate (FAR). FDR and FAR are defined as (Jiao et al., 2017)

$$\mathrm{FDR} = \mathrm{prob}\left(T^2 > J_{th} \mid f \neq 0\right)$$

$$\mathrm{FAR} = \mathrm{prob}\left(T^2 > J_{th} \mid f = 0\right)$$

where $J_{th}$ is the control limit, $f \neq 0$ indicates that a KPI-related fault occurs, and $f = 0$ indicates that a KPI-unrelated fault occurs.
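The two rates can be estimated from test data as simple exceedance frequencies. The sketch below is a minimal illustration, assuming that each sample carries a statistic value and a flag saying whether a KPI-related fault is present; the function names and the sample values are hypothetical.

```python
def exceed_rate(values, threshold):
    """Fraction of values strictly above the threshold."""
    values = list(values)
    if not values:
        raise ValueError("at least one sample is required")
    return sum(v > threshold for v in values) / len(values)

def fdr_far(statistic, kpi_fault, threshold):
    """Estimate FDR over samples with a KPI-related fault and FAR over the rest."""
    with_fault = [s for s, f in zip(statistic, kpi_fault) if f]
    without_fault = [s for s, f in zip(statistic, kpi_fault) if not f]
    return exceed_rate(with_fault, threshold), exceed_rate(without_fault, threshold)

# Illustrative values only.
t2 = [0.5, 3.2, 3.4, 4.1, 0.9, 5.0]
kpi_fault = [False, False, True, True, True, True]
fdr, far = fdr_far(t2, kpi_fault, threshold=3.0)
print(fdr, far)  # prints 0.75 0.5
```

A good KPI-related monitoring method should give a high FDR together with a low FAR.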

Conclusions

In this paper, two novel methods for nonlinear KPI-related state monitoring based on KCCA are proposed. The two methods divide the feature space by different decompositions, and their performance differs. The TE process is used to verify the efficacy of the two methods; in comparison with other methods, the two proposed methods perform better. KPI-KCCA performs better than MKCCA for state monitoring; however, off-line modeling is more difficult for KPI-KCCA than for MKCCA. However,

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (32)

  • Zhu, Q. et al. Concurrent quality and process monitoring with canonical correlation analysis. Journal of Process Control (2017)
  • Bach, F.R. et al. Kernel independent component analysis. Journal of Machine Learning Research (2003)
  • Chen, Z. Data-driven fault detection for industrial processes (2017)
  • Dunia, R. et al. Identification of faulty sensors using principal component analysis. AIChE Journal (1996)
  • Ge, Z. et al. Review of recent research on data-based process monitoring. Industrial and Engineering Chemistry Research (2013)
  • Golub, G.H. et al. Matrix computations (1983)

This study was supported by the National Natural Science Foundation of China under Grant 61822308, Shandong Province Natural Science Foundation, China under Grant JQ201812, and Program for Entrepreneurial and Innovative Leading Talents of Qingdao, China under Grant 19-3-2-4-zhc.