Automatic determination of optimal fault detection filter

doi:10.1016/j.jprocont.2022.08.009

Journal of Process Control

Volume 118, October 2022, Pages 69-81

https://doi.org/10.1016/j.jprocont.2022.08.009

Highlights

• Enabling automatic determination of optimal fault detection filter.
• Using kernel density estimation to refine the thresholds.
• Enhancing fault detection performance without using fault data.
• Performance validated in Tennessee Eastman benchmark.

Abstract

Optimal detection filters can greatly enhance fault detection performance, but designing them requires fault data, which are difficult to obtain in practice. This paper proposes a scheme that automatically determines the optimal detection filter from a filter bank online, without using fault data. The method can improve the fault detection rate and shorten the detection time. To reduce the false alarm rate, a threshold setting method based on kernel density estimation is introduced. Implementation issues concerning the filter bank design and the online decision rule are also discussed. The method is validated on a numerical example and the Tennessee Eastman process, and its performance is compared with that of other state-of-the-art methods.

Introduction

The growing complexity and strict performance requirements of modern technical systems have made fault diagnosis an important issue in both scientific and engineering research. Fault diagnosis is a multi-step process that typically includes fault detection, isolation and identification [1]. Fault detection (FD) is the first and most important of these steps, since it determines the success of the whole diagnosis.

Model-based and data-driven methods are the two most-investigated branches in the process control community, and they adopt entirely different methodologies. Classical model-based methods use first-principles state–space models and observer techniques to generate residuals and then perform FD [1]; the dynamics of the plant are explicitly considered, and the methods can be made robust to disturbances and model errors. However, these works generally assume that the nominal model and the bounds on the system uncertainty are known, which is unrealistic in practice. Besides the deterministic settings above, there are also stochastic methods based on the Kalman filter, e.g., [2], [3], [4].

Plant models in process industries are highly complex, which makes first-principles modeling a demanding and costly task. Alternatively, many researchers use the large amount of available historical data to extract features and information for FD, which is known as data-driven or machine-learning-based methods [5]. Data-driven methods can further be divided into unsupervised and supervised ones, according to whether they use labeled datasets. In unsupervised methods, all relevant variables are combined into one high-dimensional vector, and a low-dimensional latent variable (LV) subspace is then constructed to preserve the main information. Fault detection is performed by checking consistency with normal (fault-free) variations in this reduced-dimensional subspace (principal subspace) and in its complement (residual subspace). Various methods can be developed by giving the LVs different physical meanings, e.g., principal component analysis (PCA, LVs having as much variance as possible) [6] and slow feature analysis (SFA, LVs varying as slowly as possible) [7]. When there are causal relations between variables, one can use the LVs extracted from input signals to predict output signals, which results in supervised methods such as partial least squares (PLS) [6] and canonical correlation analysis (CCA) [8]. The latent variable methods (LVMs) mentioned above can also be modified into dynamic versions [9], in which the LVs are designed to be correlated in time to better model the dynamic process.

Besides the supervised methods above, which use continuous labels, there are also methods that use discrete labels, which leads to classification-based methods. Unlike supervised LVMs, which only require normal data, classification-based methods also require fault data. After training on datasets labeled as normal or as one of several fault types, the model should output which fault has occurred (or that no fault has occurred) when online measurements arrive. Popular methods for this problem are Fisher discriminant analysis (FDA) [10], support vector machines (SVM) [11], [12] and artificial neural networks (ANN) [13], [14].

Compared to model-based methods, data-driven methods may provide more practical solutions, because building data-driven models requires less process knowledge and only modest computational power. However, several issues deserve investigation:

(1) From the viewpoint of mathematical modeling, process models should have the structure of partial or ordinary differential equations, but there is no direct link from such models to data-driven ones;
(2) For plants under closed-loop control and corrupted by noise, the full system behavior can only be revealed by adding persistently exciting test signals [15], [16]; normal data without excitation cannot provide enough information;
(3) Fault data are rare or even unavailable in real applications, which poses challenges for methods using fault data;
(4) Most data-driven methods only use time-domain information, although frequency-domain information is also useful for improving the detection performance.

System identification is a data-driven modeling technique that can make a useful contribution to fault diagnosis. It actively adds test signals to the plant to generate informative data, and then delivers dynamic transfer-function or state–space models, which are equivalent to ordinary differential equations. Although identified models provide less information about processes than first-principles models (if available), they directly reveal the dynamic relations between process variables. Compared to dynamic LVMs, they are closer to the process dynamics and more interpretable. Hence identification-based fault detection methods [17], [18], [19] are very helpful in solving issues (1) and (2).

As an attempt to address issue (4), an optimal filter framework was recently proposed in [19]. The optimal filters are designed according to frequency-domain information about faults and disturbances. With such filters, the detection performance of residuals can be greatly enhanced. In the Tennessee Eastman process (TEP) benchmark, the method achieved the highest detection performance among many data-driven methods and the subspace-aided approach (SAP). However, its reliance on fault data is a restriction in view of issue (3).

Following the line of [19], this paper aims to overcome the reliance on fault data and to improve the applicability of the optimal filter framework. Specifically, the paper makes the following contributions: (1) it proposes a scheme that automatically determines an optimal detection filter online without using fault data and that can handle faults with varying spectra; (2) a threshold setting method based on kernel density estimation (KDE) and an online decision rule are developed to further refine the detection performance.

The paper is organized as follows: Section 2 gives the basics of the system description and fault detection, together with a brief review of the optimal detection filter; Section 3 contains the main results, where the automatic determination scheme for the optimal detection filter is developed, its detection performance is analyzed, and a threshold setting method is proposed; Section 4 discusses implementation of the method; Section 5 uses a numerical example to illustrate the method; Section 6 applies the method to the TEP benchmark and compares it with other state-of-the-art methods, and a detailed analysis of four specific faults is also given therein; Section 7 is the conclusion.

Section snippets

Fault detection

This paper considers multi-input multi-output (MIMO) linear time-invariant (LTI) systems of the form $y(t) = G(q) u(t) + v(t)$, where $G(q)$ is the system transfer matrix with dimension $n_{y} \times n_{u}$, and $u(t) \in \mathbb{R}^{n_{u}}$, $y(t) \in \mathbb{R}^{n_{y}}$ and $v(t) \in \mathbb{R}^{n_{y}}$ denote the input, output and disturbance signals, respectively. The system can be either open-loop or under feedback control. Based on a system model $\hat{G}(q)$ and input–output signals, the output error (OE) residual in the normal case is $r^{n}(t) = y(t) - \hat{G}(q) u(t) = \Delta G(q) u(t) + v(t)$, where $\Delta G(q) := G(q) - \hat{G}($
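
The OE residual defined above can be illustrated for a single input–output channel. The following is a minimal sketch, not the authors' implementation: the first-order plant, the noise level and the helper names `lfilter` and `oe_residual` are assumptions made only for illustration. With an exact model ($\Delta G = 0$) the residual reduces to the disturbance $v(t)$; with a mismatched model it also contains $\Delta G(q) u(t)$.

```python
import random


def lfilter(b, a, x):
    """Filter x through B(q)/A(q); b, a hold coefficients of q^0, q^-1, ... with a[0] == 1."""
    y = []
    for t in range(len(x)):
        acc = sum(b[k] * x[t - k] for k in range(len(b)) if t - k >= 0)
        acc -= sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y.append(acc)
    return y


def oe_residual(y, u, b_hat, a_hat):
    """Output error residual r(t) = y(t) - G_hat(q) u(t) for one channel."""
    y_hat = lfilter(b_hat, a_hat, u)
    return [yt - yh for yt, yh in zip(y, y_hat)]


def mean_square(x):
    return sum(xi * xi for xi in x) / len(x)


rng = random.Random(0)
n = 500
u = [rng.gauss(0.0, 1.0) for _ in range(n)]  # test input (assumed white noise)
v = [rng.gauss(0.0, 0.1) for _ in range(n)]  # disturbance (assumed white noise)

# Hypothetical "true" plant G(q) = 0.5 q^-1 / (1 - 0.8 q^-1), chosen for illustration.
b_true, a_true = [0.0, 0.5], [1.0, -0.8]
y = [g + vt for g, vt in zip(lfilter(b_true, a_true, u), v)]

r_exact = oe_residual(y, u, b_true, a_true)         # Delta G = 0, so r = v
r_mismatch = oe_residual(y, u, [0.0, 0.4], a_true)  # Delta G != 0, so r = Delta G u + v

print(mean_square(r_exact), mean_square(r_mismatch))
```

The mismatched model yields a larger residual power even without a fault, which is why model error has to be accounted for when thresholds are set.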

Automatic determination of optimal fault detection filter

In this section, a framework using a filter bank is proposed, which automatically determines the best approximation of the ideal optimal filter online. Moreover, a new threshold setting method using kernel density estimation (KDE) is developed in order to reduce the false alarm rate (FAR).
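
The idea of a KDE-based threshold can be sketched generically. The paper's own procedure is not shown in this snippet, so everything below is an assumption: the detection index is taken to be the squared value of a scalar residual, the kernel is Gaussian, the bandwidth follows the normal-reference rule of thumb, and the target false alarm rate `alpha` is 1%. The threshold is the point where the estimated cumulative distribution of the index, built from normal data only, reaches `1 - alpha`.

```python
import math
import random
import statistics


def bandwidth(samples):
    """Normal-reference rule-of-thumb bandwidth (an assumed, common default)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def kde_cdf(x, samples, h):
    """CDF at x of a Gaussian-kernel density estimate with bandwidth h."""
    root2 = math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / (h * root2))) for s in samples) / len(samples)


def kde_threshold(samples, alpha=0.01):
    """Threshold J with estimated P(index > J) = alpha, found by bisection."""
    h = bandwidth(samples)
    lo, hi = min(samples), max(samples) + 10.0 * h  # the KDE CDF is essentially 1 at hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return hi


rng = random.Random(0)
# Assumed detection index: squared scalar residual collected under normal operation.
index_normal = [rng.gauss(0.0, 1.0) ** 2 for _ in range(2000)]

J = kde_threshold(index_normal, alpha=0.01)
far_train = sum(1 for s in index_normal if s > J) / len(index_normal)
print(J, far_train)
```

Because the threshold comes from the estimated density rather than from an assumed Gaussian or chi-square distribution, it adapts to the actual shape of the index under normal operation, at the cost of depending on the bandwidth choice.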

Implementation issues

This section discusses implementation issues, including the filter bank design and the online decision rule, and gives a summary of the method.

A numerical example

This section studies an open-loop MIMO system of the form (1) to illustrate the proposed method. The normal system is given by: $G(q) = \begin{pmatrix} \frac{0.0115 q^{-1} + 0.00639 q^{-2}}{1 - 1.963 q^{-1} + 0.965 q^{-2}} & \frac{0.0153 q^{-2} + 0.019 q^{-3}}{1 - 1.67 q^{-1} + 0.697 q^{-2}} \\ \frac{q^{-1} + 0.5 q^{-2}}{1 - 1.5 q^{-1} + 0.7 q^{-2}} & \frac{2.808 q^{-3} - 0.968 q^{-4}}{1 - 1.838 q^{-1} + 0.887 q^{-2}} \end{pmatrix},$ $v(t) = \begin{pmatrix} \frac{1 - 0.741 q^{-1} + 0.0189 q^{-2}}{1 - 1.905 q^{-1} + 0.906 q^{-2}} & 0 \\ 0 & \frac{1 - 0.558 q^{-1} - 0.391 q^{-2}}{1 - 1.929 q^{-1} + 0.932 q^{-2}} \end{pmatrix} e(t),$ $u(t) = \begin{pmatrix} \frac{1}{1 - 0.88 q^{-1}} & 0 \\ 0 & \frac{1 - 0.3 q^{-1}}{1 - 1.94 q^{-1}} \end{pmatrix} w(t),$ $e(t) \sim N\left(0, \begin{pmatrix} 0.25^{2} & 0 \\ 0 & 0.8^{2} \end{pmatrix}\right), \quad w(t) \sim N\left(0, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right).$
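
To get a feel for the dynamics of this example, the sketch below inspects only the (1,1) entry of $G(q)$, with coefficients copied from the expression above. It computes the poles and the static gain and checks them against a simulated unit-step response. The helper `step_response` and the simulation length are assumptions made for illustration; the other entries and the signal models are not covered.

```python
import cmath

# Entry (1,1) of G(q): (0.0115 q^-1 + 0.00639 q^-2) / (1 - 1.963 q^-1 + 0.965 q^-2)
b = [0.0, 0.0115, 0.00639]
a = [1.0, -1.963, 0.965]

# Poles are the roots of z^2 + a[1] z + a[2] = 0.
disc = cmath.sqrt(a[1] ** 2 - 4.0 * a[2])
poles = [(-a[1] + disc) / 2.0, (-a[1] - disc) / 2.0]

# Static gain G(1): ratio of the coefficient sums.
dc_gain = sum(b) / sum(a)


def step_response(b, a, n):
    """Response to a unit step applied at t = 0, computed by direct recursion."""
    y = []
    for t in range(n):
        acc = sum(b[k] for k in range(len(b)) if t - k >= 0)  # input is 1 for t >= 0
        acc -= sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y.append(acc)
    return y


y_step = step_response(b, a, 3000)
print([abs(p) for p in poles], dc_gain, y_step[-1])
```

The poles form a complex pair with radius of about 0.982, so this channel is stable but slow, and its static gain is about 8.945; the simulated step response settles at the same value.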

Benchmark example: Tennessee Eastman process

This section studies the application to TEP. The TEP model is a realistic simulation program of a chemical plant, which is widely accepted as a benchmark for fault detection and diagnosis. TEP contains five major units: reactor, condenser, compressor, separator, and stripper. The total 52 variables containing 41 process measurements and 12 manipulated variables are listed in Table 3, in which … denotes process measurement $i$ and … denotes manipulated variable $i$. The study in this section is

Conclusion

This paper proposes a scheme that automatically determines the optimal detection filter without using fault data. The method can automatically select the most suitable filter from a filter bank that improves FDR and accelerates MT2D. A threshold setting method based on KDE is proposed in order to reduce FAR. The design of the filter bank and an online decision rule are also provided. In the case study of TEP, the proposed OEFB gives the best detection performance. Combining with system

CRediT authorship contribution statement

Jinming Zhou: Methodology, Formal analysis, Software, Validation, Writing – original draft. Yucai Zhu: Conceptualization, Supervision, Writing – review & editing, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (38)

Zhang Q., Adaptive Kalman filter for actuator fault diagnosis, Automatica (2018)
Qin S.J. et al., Bridging systems theory and data science: A unifying review of dynamic latent variable analytics and process monitoring, Annu. Rev. Control (2020)
Chen G. et al., SVM-tree and SVM-forest algorithms for imbalanced fault classification in industrial processes, IFAC J. Syst. Control (2019)
Wu H. et al., Deep convolutional neural network model based chemical process fault diagnosis, Comput. Chem. Eng. (2018)
Willems J.C. et al., A note on persistency of excitation, Systems Control Lett. (2005)
Ding S.X. et al., Subspace method aided data-driven design of fault detection and isolation systems, J. Process Control (2009)
Zhou J. et al., Identification based fault detection: Residual selection and optimal filter, J. Process Control (2021)
Ding S.X. et al., Application of randomized algorithms to assessment and design of observer-based fault detection systems, Automatica (2019)
Zhou J. et al., Fault isolation based on transfer-function models using an MPC algorithm, Comput. Chem. Eng. (2022)
Yin S. et al., A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process, J. Process Control (2012)
Lyman P.R. et al., Plant-wide control of the Tennessee Eastman problem, Comput. Chem. Eng. (1995)
Downs J.J. et al., A plant-wide industrial process control problem, Comput. Chem. Eng. (1993)
Zhang Z. et al., Gaussian feature learning based on variational autoencoder for improving nonlinear process monitoring, J. Process Control (2019)
Dong Y. et al., A novel dynamic PCA algorithm for dynamic data modeling and process monitoring, J. Process Control (2018)
Yin S. et al., Study on modifications of PLS approach for process monitoring, IFAC Proc. Vol. (2011)
Gao X. et al., Dynamic system modelling and process monitoring based on long-term dependency slow feature analysis, J. Process Control (2021)
Wu D. et al., Multimode process monitoring based on fault dependent variable selection and moving window-negative log likelihood probability, Comput. Chem. Eng. (2020)
Sun W. et al., Fault detection and identification using Bayesian recurrent neural networks, Comput. Chem. Eng. (2020)
Lu Q. et al., Sparse canonical variate analysis approach for process monitoring, J. Process Control (2018)

☆ This work is supported by the National Natural Science Foundation of China, Grant Numbers: U1809207, 61673343.
