Elsevier

Journal of Process Control

Volume 118, October 2022, Pages 69-81

Automatic determination of optimal fault detection filter

https://doi.org/10.1016/j.jprocont.2022.08.009

Highlights

  • Enabling automatic determination of optimal fault detection filter.

  • Using kernel density estimation to refine the thresholds.

  • Enhancing fault detection performance without using fault data.

  • Performance validated in Tennessee Eastman benchmark.

Abstract

Optimal detection filters can greatly enhance fault detection performance, but designing these filters requires fault data, which are difficult to obtain in practice. This paper proposes a scheme that automatically determines the optimal detection filter from a filter bank online without using fault data. The method can improve the fault detection rate and shorten the detection time. To reduce the false alarm rate, a threshold setting method based on kernel density estimation is introduced. Implementation issues concerning the filter bank design and the online decision rule are also discussed. The method is validated on a numerical example and the Tennessee Eastman process, and its performance is compared with that of other state-of-the-art methods.

Introduction

The growing complexity and strict performance requirements of modern technical systems have made fault diagnosis an important topic in both scientific and engineering research. Fault diagnosis is a multi-step process that typically comprises fault detection, isolation and identification [1]. Fault detection (FD) is the first and most important step, since the success of the subsequent diagnosis depends on it.

Model-based and data-driven methods are the two most-investigated branches in the process control community, and they adopt entirely different methodologies. Classical model-based methods use first-principles state–space models and observer techniques to generate residuals and then perform FD [1]; the plant dynamics are taken into account explicitly, and the methods can be made robust to disturbances and model errors. However, these works generally assume that the nominal model and the bounds on the system uncertainty are known, which is unrealistic in practice. Besides the deterministic settings above, there are also stochastic methods based on the Kalman filter, e.g., [2], [3], [4].

Plant models in the process industries are highly complex, which makes first-principles modeling a demanding and costly task. Alternatively, many researchers use the large amount of available historical data to extract features and information for FD; these are known as data-driven or machine-learning-based methods [5]. Data-driven methods can be further divided into unsupervised and supervised ones according to whether they use labeled datasets. In unsupervised methods, all relevant variables are combined into one high-dimensional vector, and a low-dimensional latent variable (LV) subspace is then constructed to preserve the main information. Fault detection is performed by checking consistency with normal (fault-free) variations in this reduced-dimensional subspace (the principal subspace) and in its complement (the residual subspace). Different methods arise from assigning different physical meanings to the LVs, e.g., principal component analysis (PCA, LVs with as much variance as possible) [6] and slow feature analysis (SFA, LVs varying as slowly as possible) [7]. When there are causal relations between variables, the LVs extracted from input signals can be used to predict output signals, which leads to supervised methods such as partial least squares (PLS) [6] and canonical correlation analysis (CCA) [8]. The latent variable methods (LVMs) mentioned above can also be extended to dynamic versions [9], in which the LVs are designed to be correlated in time to better model the process dynamics.
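The subspace consistency check described above can be sketched for the PCA case as follows. This is a minimal illustration rather than the implementation of any cited work: the function names, the simulated data, the mixing matrix `A`, the 99th-percentile control limit and the sensor bias are all assumptions chosen for the example. The T² statistic measures variation inside the principal subspace and the SPE statistic measures what is left in the residual subspace.

```python
import numpy as np

def fit_pca_monitor(X_normal, n_components):
    """Fit a PCA monitoring model on fault-free data (rows are samples)."""
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std
    # SVD of the scaled data yields the loadings and the LV variances.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                      # basis of the principal subspace
    lam = s[:n_components] ** 2 / (len(Z) - 1)   # variance of each LV
    return {"mean": mean, "std": std, "P": P, "lam": lam}

def monitoring_statistics(model, X):
    """Return the T2 and SPE statistics for each row of X."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                           # scores (latent variables)
    t2 = np.sum(T ** 2 / model["lam"], axis=1)   # principal subspace
    E = Z - T @ model["P"].T                     # residual subspace part
    spe = np.sum(E ** 2, axis=1)
    return t2, spe

# Hypothetical process: 6 measured variables driven by 2 latent variables.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.8, 0.5, 0.0, 0.3, -0.6],
              [0.0, 0.4, -0.7, 1.0, 0.9, 0.2]])

def simulate(n):
    return rng.standard_normal((n, 2)) @ A + 0.1 * rng.standard_normal((n, 6))

X_train = simulate(2000)
model = fit_pca_monitor(X_train, n_components=2)
_, spe_train = monitoring_statistics(model, X_train)
spe_limit = np.percentile(spe_train, 99)         # assumed empirical control limit

# Assumed fault: a constant bias on one sensor breaks the normal correlation.
X_fault = simulate(500)
X_fault[:, 2] += 2.0
_, spe_fault = monitoring_statistics(model, X_fault)

far_train = np.mean(spe_train > spe_limit)
fdr_fault = np.mean(spe_fault > spe_limit)
print(f"alarm rate on normal data: {far_train:.3f}, on faulty data: {fdr_fault:.3f}")
```

A sensor bias violates the correlation structure learned from normal data, so it appears mainly in the SPE statistic; a fault that only scales the normal variation would instead appear in T².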

Besides the supervised methods above, which use continuous labels, there are also methods that use discrete labels, leading to classification-based methods. Unlike supervised LVMs, which require only normal data, classification-based methods also require fault data. After training on datasets labeled as normal or as one of several fault types, the model should indicate which fault (if any) has occurred when online measurements arrive. Popular methods for this problem include Fisher discriminant analysis (FDA) [10], support vector machines (SVM) [11], [12] and artificial neural networks (ANN) [13], [14].

Compared to model-based methods, data-driven methods may provide more practical solutions, because building data-driven models requires less process knowledge and only modest computing power. However, several issues deserve investigation:

  • (1)

    In view of mathematical modeling, process models should have the structure of partial or ordinary differential equations, while there is no direct link from these models to data-driven ones;

  • (2)

    For plants under closed-loop control and corrupted by noise, the full system behavior can be revealed only by adding persistently exciting test signals [15], [16]; normal data without excitation cannot provide enough information;

  • (3)

    Fault data are rare or even unavailable in real applications, which poses challenges for methods using fault data;

  • (4)

    Most data-driven methods utilize only time-domain information, although frequency-domain information is also useful for improving detection performance.

System identification is a data-driven modeling technique that can make a useful contribution to fault diagnosis. It actively adds test signals to the plant to generate informative data and then delivers dynamic transfer-function or state–space models, which are equivalent to ordinary differential equations. Although identified models provide less information about processes than first-principles models (if available), they directly reveal the dynamic relations between process variables. Compared to dynamic LVMs, they are closer to the process dynamics and more interpretable. Hence identification-based fault detection methods [17], [18], [19] are very helpful in addressing issues (1) and (2).

As an attempt to address issue (4), an optimal filter framework was recently proposed in [19]. The optimal filters are designed according to frequency-domain information about faults and disturbances, and filtering the residuals with them greatly enhances detection performance. On the Tennessee Eastman process (TEP) benchmark, the method achieved the highest detection performance among many data-driven methods and the subspace-aided approach (SAP). However, in view of issue (3), its reliance on fault data is a restriction.

Following the line of [19], this paper aims to overcome the reliance on fault data and to improve the applicability of the optimal filter framework. Specifically, the paper makes the following contributions: (1) it proposes a scheme that automatically determines an optimal detection filter online without using fault data and that can handle faults with varying spectra; (2) it develops a threshold setting method based on kernel density estimation (KDE) and an online decision rule to further refine the detection performance.

The paper is organized as follows. Section 2 gives the basics of the system description and fault detection, together with a brief review of the optimal detection filter. Section 3 contains the main results: the automatic determination scheme for the optimal detection filter is developed, its detection performance is analyzed, and a threshold setting method is proposed. Section 4 discusses implementation of the method. Section 5 uses a numerical example to illustrate the method. Section 6 applies the method to the TEP benchmark, compares it to other state-of-the-art methods, and gives a detailed analysis of four specific faults. Section 7 concludes the paper.

Section snippets

Fault detection

This paper considers multi-input multi-output (MIMO) linear time-invariant (LTI) systems of the form

$$y(t) = G(q)u(t) + v(t) \tag{1}$$

where $G(q)$ is the system transfer matrix of dimension $n_y \times n_u$, and $u(t) \in \mathbb{R}^{n_u}$, $y(t) \in \mathbb{R}^{n_y}$ and $v(t) \in \mathbb{R}^{n_y}$ denote the input, output and disturbance signals, respectively. The system can be either open-loop or under feedback control. Based on a system model $\hat{G}(q)$ and the input–output signals, the output error (OE) residual in the normal case is

$$r_n(t) = y(t) - \hat{G}(q)u(t) = \Delta G(q)u(t) + v(t)$$

where $\Delta G(q) = G(q) - \hat{G}(q)$
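As an illustration of the output error residual, the following sketch simulates a single-input single-output plant and a slightly mismatched model. The paper treats the MIMO case; the first-order coefficients, noise level and sensor bias fault below are assumptions made only for this example. The script checks numerically that, without a fault, the residual equals the model-error term plus the disturbance.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
N = 2000
u = rng.standard_normal(N)                 # input signal

# Hypothetical plant G(q) and mismatched model G_hat(q), given as
# numerator and denominator polynomials in q^-1.
b_true, a_true = [0.0, 0.50], [1.0, -0.80]
b_hat, a_hat = [0.0, 0.48], [1.0, -0.81]

v = 0.1 * rng.standard_normal(N)           # disturbance v(t)
y = lfilter(b_true, a_true, u) + v         # y(t) = G(q) u(t) + v(t)

def oe_residual(y, u, b_hat, a_hat):
    """Output error residual r(t) = y(t) - G_hat(q) u(t)."""
    return y - lfilter(b_hat, a_hat, u)

r_normal = oe_residual(y, u, b_hat, a_hat)

# In the normal case the residual is Delta_G(q) u(t) + v(t).
dG_u = lfilter(b_true, a_true, u) - lfilter(b_hat, a_hat, u)
residual_matches = np.allclose(r_normal, dG_u + v)

# Assumed fault: a constant bias of 0.5 on the measured output from t = 1000.
f = np.zeros(N)
f[1000:] = 0.5
r_fault = oe_residual(y + f, u, b_hat, a_hat)

print("normal residual equals dG*u + v:", residual_matches)
print("mean residual shift after the fault:",
      r_fault[1000:].mean() - r_normal[1000:].mean())
```

Because the assumed fault is additive on the output, it passes into the residual unchanged, so detection reduces to distinguishing the fault from the model-error and disturbance terms.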

Automatic determination of optimal fault detection filter

In this section, a framework using a filter bank is proposed, which automatically determines the best approximation of the ideal optimal filter online. Moreover, a new threshold setting method using kernel density estimation (KDE) is developed to reduce the false alarm rate (FAR).
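The idea of a KDE-based threshold can be sketched as follows. This is one possible reading and not necessarily the procedure developed in the paper: the function `kde_threshold`, the grid search over the estimated cumulative distribution, the significance level `alpha` and the chi-square test statistic are assumptions made for illustration. The density of a detection statistic is estimated from normal data, and the threshold is placed where the estimated cumulative distribution reaches 1 − alpha.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_threshold(stat_normal, alpha=0.01, grid_size=1000):
    """Threshold at the (1 - alpha) quantile of a KDE fitted to normal data."""
    stat_normal = np.asarray(stat_normal, dtype=float)
    kde = gaussian_kde(stat_normal)              # Gaussian kernel, Scott bandwidth
    pad = 3.0 * stat_normal.std()
    grid = np.linspace(stat_normal.min() - pad, stat_normal.max() + pad, grid_size)
    # Estimated CDF on the grid, obtained by integrating the KDE.
    cdf = np.array([kde.integrate_box_1d(-np.inf, x) for x in grid])
    idx = min(np.searchsorted(cdf, 1.0 - alpha), grid_size - 1)
    return grid[idx]

# Hypothetical detection statistic with a skewed, non-Gaussian distribution.
rng = np.random.default_rng(1)
stat_train = rng.chisquare(df=2, size=3000)      # normal (fault-free) training data
stat_test = rng.chisquare(df=2, size=20000)      # fresh normal data

threshold = kde_threshold(stat_train, alpha=0.01)
far_test = np.mean(stat_test > threshold)
print(f"threshold = {threshold:.2f}, FAR on fresh normal data = {far_test:.4f}")
```

The appeal of a density estimate here is that no distributional form needs to be assumed for the statistic, which matters when the filtered residuals are not Gaussian.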

Implementation issues

This section discusses implementation issues, including the filter bank design and the online decision rule, and summarizes the method.
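To make the filter bank idea concrete, the following sketch shows one simple way a bank and an online selection rule could be organized. It is not the design or the decision rule of the paper, whose details are not reproduced in this snippet. The Butterworth band-pass filters, the band edges, the moving-average energy statistic, the empirical quantile thresholds and the sinusoidal fault are all assumptions. Each filter output is normalized by its own threshold obtained from normal data; an alarm is raised when any normalized statistic exceeds one, and the filter with the largest value is taken as the selected filter.

```python
import numpy as np
from scipy.signal import butter, sosfilt

BAND_EDGES = [0.02, 0.1, 0.3, 0.6, 0.95]   # assumed bands, Nyquist frequency = 1
WINDOW = 100                               # assumed moving-average length

def design_bank(edges, order=4):
    """One Butterworth band-pass filter per pair of adjacent band edges."""
    return [butter(order, [lo, hi], btype="bandpass", output="sos")
            for lo, hi in zip(edges[:-1], edges[1:])]

def band_energy(bank, r, window=WINDOW):
    """Causal moving average of the squared output of each filter."""
    kernel = np.ones(window) / window
    rows = []
    for sos in bank:
        rf = sosfilt(sos, r)
        rows.append(np.convolve(rf ** 2, kernel, mode="full")[:len(r)])
    return np.array(rows)                  # shape: (n_filters, n_samples)

rng = np.random.default_rng(2)
bank = design_bank(BAND_EDGES)

# Thresholds from fault-free residuals only (no fault data is used).
r_train = rng.standard_normal(20000)
thresholds = np.quantile(band_energy(bank, r_train), 0.999, axis=1)

# Online residual: an assumed narrow-band fault at 0.2 (second band) from onset.
n, onset = 3000, 1500
t = np.arange(n)
r_online = rng.standard_normal(n)
r_online[onset:] += 1.5 * np.sin(0.2 * np.pi * t[onset:])

ratio = band_energy(bank, r_online) / thresholds[:, None]
alarm = ratio.max(axis=0) > 1.0            # alarm if any filter exceeds its threshold
selected = ratio.argmax(axis=0)            # index of the most responsive filter

fdr = alarm[onset:].mean()
dominant = np.bincount(selected[onset:][alarm[onset:]], minlength=len(bank)).argmax()
print(f"detection rate after onset: {fdr:.3f}, dominant filter index: {dominant}")
```

In this sketch the selected filter follows the band in which the fault energy is concentrated, so a fault whose spectrum changes over time would lead to a different selection without any retraining.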

A numerical example

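This section studies an open-loop MIMO system of the form (1). The goal is to illustrate the proposed method and to aid understanding. The normal system is given by

$$G(q) = \begin{bmatrix} \dfrac{0.0115q^{-1} + 0.00639q^{-2}}{1 - 1.963q^{-1} + 0.965q^{-2}} & \dfrac{0.0153q^{-2} + 0.019q^{-3}}{1 - 1.67q^{-1} + 0.697q^{-2}} \\[2ex] \dfrac{q^{-1} + 0.5q^{-2}}{1 - 1.5q^{-1} + 0.7q^{-2}} & \dfrac{2.808q^{-3} - 0.968q^{-4}}{1 - 1.838q^{-1} + 0.887q^{-2}} \end{bmatrix},$$

$$v(t) = \begin{bmatrix} \dfrac{1 - 0.741q^{-1} + 0.0189q^{-2}}{1 - 1.905q^{-1} + 0.906q^{-2}} & 0 \\[2ex] 0 & \dfrac{1 - 0.558q^{-1} - 0.391q^{-2}}{1 - 1.929q^{-1} + 0.932q^{-2}} \end{bmatrix} e(t),$$

$$u(t) = \begin{bmatrix} \dfrac{1}{1 - 0.88q^{-1}} & 0 \\[2ex] 0 & \dfrac{1 - 0.3q^{-1}}{1 - 1.94q^{-1}} \end{bmatrix} w(t),$$

$$e(t) \sim N\!\left(0, \begin{bmatrix} 0.25^2 & 0 \\ 0 & 0.8^2 \end{bmatrix}\right), \qquad w(t) \sim N\!\left(0, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right).$$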

Benchmark example: Tennessee Eastman process

This section studies the application to the TEP. The TEP model is a realistic simulation program of a chemical plant that is widely accepted as a benchmark for fault detection and diagnosis. The TEP contains five major units: reactor, condenser, compressor, separator, and stripper. A total of 52 variables, containing 41 process measurements and 12 manipulated variables, are listed in Table 3, in which

denotes process measurement i and
denotes manipulated variable i. The study in this section is

Conclusion

This paper proposes a scheme that automatically determines the optimal detection filter without using fault data. The method can automatically select the most suitable filter from a filter bank that improves FDR and accelerates MT2D. A threshold setting method based on KDE is proposed in order to reduce FAR. The design of the filter bank and an online decision rule are also provided. In the case study of TEP, the proposed OEFB gives the best detection performance. Combining with system

CRediT authorship contribution statement

Jinming Zhou: Methodology, Formal analysis, Software, Validation, Writing – original draft. Yucai Zhu: Conceptualization, Supervision, Writing – review & editing, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

This work is supported by the National Natural Science Foundation of China, Grant Numbers: U1809207, 61673343.
