Comparison of advanced set-based fault detection methods with classical data-driven and observer-based methods for uncertain nonlinear processes

doi:10.1016/j.compchemeng.2022.107975

Computers & Chemical Engineering

Volume 166, October 2022, 107975

https://doi.org/10.1016/j.compchemeng.2022.107975

Highlights

• Presents a new set-based fault detection method with improved fault sensitivity.
• Provides a detailed comparison of set-based methods with classical methods.
• Classical methods detect faults faster but suffer from frequent false alarms.
• Set-based fault detection methods can guarantee a zero false alarm rate.
• Set-based methods are more robust to disturbances, model uncertainty, and transients.

Abstract

Automated fault detection (FD) methods are essential for safe and profitable operation of complex engineered systems. Both data-driven and model-based methods have been extensively studied, and some are widely used in practice. However, distinguishing faults from acceptable process variations remains a critical challenge, making both false alarms and missed faults commonplace. In principle, set-based FD methods can rigorously address this challenge. However, existing methods are often much too conservative, particularly for nonlinear systems. Moreover, few if any published comparisons clearly demonstrate the supposed advantages of set-based methods relative to conventional methods. This paper first presents a new set-based FD method based on discrete-time differential inequalities and demonstrates increased fault sensitivity through several case studies. Next, a detailed comparison of set-based methods with representative data-driven and model-based approaches is presented. The results verify some key advantages of the set-based approaches, but also highlight key challenges for future work.

Introduction

Due to the level of complexity, integration, and automation in modern engineered systems, equipment malfunctions and other abnormal events are frequent and unavoidable. These events, termed faults, often have serious economic, safety, and environmental consequences if not detected quickly (Venkatasubramanian et al., 2003b). At the same time, false alarms (i.e., fault declarations during normal operation) caused by benign disturbances can lead to shut-downs or other operational changes that also do significant economic harm. Thus, automated methods for detecting faults quickly and accurately are essential. This paper introduces a new set-based fault detection algorithm based on discrete-time differential inequalities and presents a detailed comparison of set-based fault detection methods with conventional data-driven and model-based approaches.

To date, the theory and practice of fault detection (FD) has been dominated by data-driven approaches (Chiang et al., 2000, Venkatasubramanian et al., 2003b). In these methods, historical data is analyzed to identify important statistics, and the variability of these statistics under normal operating conditions is quantified in terms of thresholds. Online, new measurements are compared with the historical data and a fault is declared if the current statistics violate the computed thresholds. In ideal cases (e.g., stationary Gaussian data), well-established statistical methods such as principal component analysis (PCA) can be used to identify appropriate statistics and set thresholds to achieve any desired rate of false alarms (Chiang et al., 2000). These methods are simple, scalable, and widely used in industry (Joe Qin, 2003). However, for general non-Gaussian data, these simple methods can result in inappropriate statistics or thresholds, often leading to high false alarm rates. Several advanced data-driven methods based on independent component analysis (ICA), dynamic PCA, kernel PCA, and other machine learning techniques have been developed to address this, but these have considerably higher costs and are not well established in practice (Lee et al., 2006, Ku et al., 1995, Choi et al., 2005, Lee et al., 2007). Another significant disadvantage of all data-driven methods is that they require historical data that is appropriate for the current operating point. If the current operating point is different from that of the historical data, either intentionally or due to a persistent disturbance or transient, the process statistics can deviate significantly from historical values, leading to persistent false alarms that render the method unusable (see experiments in Section 5).
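As a rough illustration of the train-then-monitor workflow described above, the following Python sketch fits a one-component PCA model to two-dimensional fault-free data and flags new points whose squared prediction error (SPE) exceeds a threshold. It is not the PCA method used in this paper's comparisons. The training data, the function names, and the empirical threshold (the largest SPE seen in training, rather than a threshold derived from Gaussian theory) are all assumptions made for illustration.

```python
import math

def fit_pca_2d(data):
    """Fit a one-component PCA model to 2-D fault-free training data."""
    n = len(data)
    m1 = sum(p[0] for p in data) / n
    m2 = sum(p[1] for p in data) / n
    c11 = sum((p[0] - m1) ** 2 for p in data) / (n - 1)
    c22 = sum((p[1] - m2) ** 2 for p in data) / (n - 1)
    c12 = sum((p[0] - m1) * (p[1] - m2) for p in data) / (n - 1)
    # Angle of the leading eigenvector of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    return (m1, m2), (math.cos(theta), math.sin(theta))

def spe(point, mean, direction):
    """Squared prediction error: squared distance to the principal axis."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    t = dx * direction[0] + dy * direction[1]  # score on the principal axis
    rx, ry = dx - t * direction[0], dy - t * direction[1]
    return rx * rx + ry * ry

def train_monitor(data):
    """Return (mean, direction, threshold) from historical data."""
    mean, direction = fit_pca_2d(data)
    # ASSUMPTION: empirical threshold = largest SPE observed in training.
    threshold = max(spe(p, mean, direction) for p in data)
    return mean, direction, threshold

def is_fault(point, model):
    """Declare a fault if the SPE of a new point violates the threshold."""
    mean, direction, threshold = model
    return spe(point, mean, direction) > threshold

# Hypothetical historical data scattered closely around the line y = 2x.
training = [(0.1 * i, 0.2 * i + 0.01 * (-1) ** i) for i in range(20)]
model = train_monitor(training)
```

With this model, a point that follows the historical correlation, such as `(1.0, 2.0)`, is accepted, while a point that breaks it, such as `(1.0, 3.0)`, is flagged. The sketch also makes the operating-point limitation visible: the mean and direction are fixed by the training data, so a legitimate shift in operation changes the SPE even when no fault is present.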

To address these limitations, another class of FD methods makes use of process models in place of historical data (Venkatasubramanian et al., 2003b, Isermann, 2005, Patton and Chen, 1997). The most standard approach is to use an observer (i.e., state estimator) to predict the most likely output values at the next sampling time. A fault is then declared if the measured outputs deviate from the predictions by more than a prescribed threshold (Patton and Chen, 1997). This eliminates the need for historical data at the current operating point and naturally handles non-steady operations. In principle, the model also provides a means to completely characterize the output statistics under fault-free conditions, which is crucial for setting thresholds that accurately distinguish faults from disturbances. However, this requires both an accurate model and accurate knowledge of the disturbance probability distributions, which is often impractical. Moreover, even when these distributions are available, propagating them through a nonlinear model to obtain output distributions is extremely difficult. Thus, significant approximations are needed, such as the use of linearization and Gaussian distributions in methods based on extended Kalman filters (Patton and Chen, 1997). In general, this can lead to inaccurate output statistics and, as a result, fault insensitivity or excessive false alarms (see experiments in Section 5).
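The observer-based test described above can be sketched for the simplest possible case. The Python example below uses a scalar linear Kalman filter, not the extended Kalman filter used in this paper's comparisons, and declares a fault when the output residual exceeds a multiple of its predicted standard deviation. The model `x[k+1] = a*x[k] + w[k]`, `y[k] = x[k] + v[k]`, the variances `q` and `r`, and the multiplier `c` are assumptions chosen for illustration.

```python
import math

def kf_fault_detect(ys, a, q, r, x0, p0, c=3.0):
    """Return the sample indices at which the residual test raises an alarm.

    ASSUMED toy model: x[k+1] = a*x[k] + w[k], y[k] = x[k] + v[k],
    with Var(w) = q and Var(v) = r. x0 and p0 are the prior mean and variance.
    """
    x, p = x0, p0
    alarms = []
    for k, y in enumerate(ys):
        s = p + r                 # predicted variance of the output residual
        resid = y - x             # measured output minus predicted output
        if abs(resid) > c * math.sqrt(s):
            alarms.append(k)      # residual exceeds the prescribed threshold
        gain = p / s              # measurement update
        x = x + gain * resid
        p = (1.0 - gain) * p
        x = a * x                 # time update (one-step prediction)
        p = a * a * p + q
    return alarms

# Hypothetical data: ten nominal measurements followed by an abrupt jump.
measurements = [0.0] * 10 + [5.0]
alarms = kf_fault_detect(measurements, a=1.0, q=0.01, r=0.01, x0=0.0, p0=1.0)
```

The threshold here is only as good as `q` and `r`. If the assumed variances understate the real disturbances, the same test produces false alarms; if they overstate them, small faults go unnoticed.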

A third class of FD methods called set-based methods attempts to address the shortcomings of observer-based methods by modeling all disturbances and measurement noises in terms of deterministic bounds rather than probability distributions. Specifically, these inputs are assumed to be bounded within known compact sets, but nothing is assumed about their distributions. Set-based computations are then used to rigorously test if a new measured output is consistent with the process model given these bounds, and a fault is declared if not. This approach is attractive because obtaining bounds on disturbances and measurement noises is often easier than obtaining accurate probability distributions. Moreover, model uncertainty can be easily incorporated using bounded time-invariant parameters, which lessens the need for a highly accurate model. Finally, these methods completely eliminate false alarms provided that the input bounds are valid. However, accurate set-based computations are required to achieve high fault sensitivity, which remains a major challenge.

Many set-based FD methods are available for linear systems using computations with intervals (Seron and De Doná, 2009, Efimov et al., 2013), polytopes (Blesa et al., 2012), ellipsoids (Reppa and Tzes, 2011), zonotopes (Scott et al., 2013, Ingimundarson et al., 2009, Tabatabaeipour et al., 2012), and constrained zonotopes (Scott et al., 2016). However, testing the consistency of a measured output with a nonlinear model is significantly more difficult. One approach is to use set-based parameter estimation, wherein measurements are used to compute an enclosure of the set of consistent model parameters. A fault is then declared when this enclosure has no overlap with a known set of possible parameter values. In Jauberthie et al. (2013), this is done using a differential algebraic approach and interval-based set inversion techniques. However, the computational cost scales exponentially with the number of uncertain parameters. This method is extended to systems with probabilistic noises using a Bayesian framework in Fernández-Cantí et al. (2013), but this does not provide rigorous bounds.

Another approach to set-based FD is to apply set-based state estimation. At each sampling time, a set-based state estimator provides a guaranteed enclosure of the set of states consistent with the model, the bounded uncertainties, and all past measurements. This is then used to compute an enclosure of the possible model outputs, and a fault is declared if the measured output is outside of this set. The key challenge is to compute sufficiently accurate enclosures fast enough for online fault detection. A method for continuous-time systems based on upper and lower Luenberger observers and cooperativity theory is proposed in Raïssi et al. (2010). In Combastel (2016), an Extended Zonotopic and Gaussian Kalman Filter is proposed for discrete-time systems with uncertainties composed of bounded and unbounded parts. However, both methods rely on conservative linearizations of the dynamics over the entire state domain, which can lead to weak enclosures compared to adaptive linearizations such as those in Alamo et al. (2005) and Combastel (2005). The FD method in Rostampour et al. (2017) also uses a set-based state estimator, but forgoes rigorous enclosures in favor of smaller sets based on a prescribed false alarm rate. Thus, this method is not guaranteed to avoid false alarms. Moreover, it requires the solution of nonlinear chance constrained optimization problems at each sampling time, which is computationally prohibitive. To reduce conservatism and increase efficiency, some approaches use approximate models with simpler structure. In Wang and Puig (2016), nonlinear models are linearized before constructing the observer, as in the extended Kalman filter. Similarly, Chai et al. (2013) approximate nonlinear input–output models using a Takagi–Sugeno fuzzy neural network that is linear in the uncertain parameters. Ellipsoidal (Chai et al., 2013) and zonotopic (Wang and Puig, 2016) enclosures are then computed for the approximate models and used for fault detection. However, these enclosures are not necessarily valid for the original system. Finally, Tulsyan and Barton (2016) proposes a set-based FD method for continuous-time systems using advanced reachable set bounding techniques based on differential inequalities. However, measurements are not used to refine the predicted enclosures as in a true set-based state estimator, which is a serious limitation.
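The estimator-based detection loop described above (enclose the outputs, test the measurement, refine with the measurement, predict) can be illustrated with intervals on a scalar linear toy system. The Python sketch below is not any of the published estimators cited here. The model `x[k+1] = a*x[k] + w[k]`, `y[k] = x[k] + v[k]` with `a >= 0`, the bounds `w_bar` and `v_bar`, and the simulated data are all assumptions made for illustration.

```python
def interval_fd(ys, x0_bounds, a, w_bar, v_bar):
    """Return the index of the first inconsistent measurement, or None.

    ASSUMED toy model: x[k+1] = a*x[k] + w[k], y[k] = x[k] + v[k],
    with a >= 0, |w[k]| <= w_bar, |v[k]| <= v_bar, and x[0] in x0_bounds.
    """
    lo, hi = x0_bounds
    for k, y in enumerate(ys):
        # Output enclosure test: is y consistent with the state enclosure?
        if y < lo - v_bar or y > hi + v_bar:
            return k
        # Correction: intersect with the states consistent with y.
        lo, hi = max(lo, y - v_bar), min(hi, y + v_bar)
        # Prediction: propagate the interval through the dynamics (a >= 0).
        lo, hi = a * lo - w_bar, a * hi + w_bar
    return None

def simulate(n, x0, a, bias_from=None, bias=0.0):
    """Generate hypothetical outputs with bounded, alternating w and v."""
    ys, x = [], x0
    for k in range(n):
        y = x + 0.02 * (-1) ** k           # measurement noise within v_bar
        if bias_from is not None and k >= bias_from:
            y += bias                      # additive sensor fault
        ys.append(y)
        x = a * x + 0.05 * (-1) ** k       # disturbance within w_bar
    return ys

normal = simulate(20, x0=1.0, a=0.9)
faulty = simulate(20, x0=1.0, a=0.9, bias_from=5, bias=1.0)
first_alarm_normal = interval_fd(normal, (0.5, 1.5), a=0.9, w_bar=0.1, v_bar=0.1)
first_alarm_faulty = interval_fd(faulty, (0.5, 1.5), a=0.9, w_bar=0.1, v_bar=0.1)
```

Because the true state always lies inside the interval when the bounds hold, the fault-free sequence can never trigger an alarm. Fault sensitivity depends entirely on how tight the interval is, which is the difficulty that nonlinear dynamics make acute.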

Despite this prior work, performing set-based computations with sufficient accuracy for effective fault detection remains a major challenge for nonlinear systems. Moreover, although the potential advantages of set-based FD methods have been articulated in many prior studies, to the best of our knowledge, no detailed studies comparing set-based FD methods to more conventional data-driven and observer-based methods are available in the literature. In this context, this article makes two main contributions. First, we present a new set-based FD algorithm based on the set-based state estimator recently developed in Yang and Scott (2018a). This estimator computes interval enclosures using the theory of discrete-time differential inequalities (DTDI) and has been shown to produce significantly more accurate enclosures than other state-of-the-art set-based state estimators for nonlinear test cases in Yang and Scott (2018a). However, this method has not previously been applied for FD. To this end, we develop a new FD algorithm using DTDI and demonstrate through case studies that it offers significantly improved fault sensitivity.

Second, we present a detailed comparison of set-based FD methods with more conventional data-driven and observer-based methods using three case studies. Specifically, we compare against the standard PCA method described in Joe Qin (2003) and the extended Kalman filter (EKF) method from Fathi et al. (1993). These methods were chosen because they are representative of the state-of-practice in data-driven and observer-based FD, respectively. We acknowledge that many alternative methods exist and may perform better in specific scenarios. Yet, none are as well established or widely used, and so it seems most informative to first understand how set-based methods compare to these classical benchmarks.

Comparing data-driven, observer-based, and set-based FD methods is challenging and somewhat ill-posed because they are based on fundamentally different assumptions about the process noises and require different information about the process that may not be accurately known in practice. Thus, any comparison necessarily involves applying the methods in cases that violate some of their assumptions. Yet, this is exactly the case when applying these methods to real systems, so it is important to study how they perform in such circumstances. To this end, we develop a framework for comparing data-driven, observer-based, and set-based FD methods using simulated systems with various noise distributions, various types of model uncertainty, various fault-free scenarios representing different operating conditions, and various types of faults. In every case, we make the most sensible approximation of how each method would be applied in practice without knowledge of the actual process and noise distributions used in the simulation.

The remainder of this paper is organized as follows. Section 2 gives a formal problem statement. Section 3 describes the representative data-driven and observer-based FD methods selected for our comparisons. Section 4 introduces the set-based FD paradigm, describes existing methods we compare against, and presents our new set-based FD algorithm. In Sections 5, 6, and 7, detailed comparisons of all FD methods are provided using three case studies. Finally, Section 8 provides further discussion and concluding remarks.

Section snippets

Problem statement

This paper considers FD algorithms for uncertain nonlinear discrete-time systems of the form
$x_{k + 1} = f (k, x_{k}, w_{k}),$
$y_{k} = g (k, x_{k}, v_{k}) .$
Above, $x_{k} \in \mathbb{R}^{n_{x}}$ is the state, $y_{k} \in \mathbb{R}^{n_{y}}$ is the output, $w_{k} \in \mathbb{R}^{n_{w}}$ is the disturbance, $v_{k} \in \mathbb{R}^{n_{v}}$ is the measurement noise, and $k \in \mathbb{K} \equiv \{0, \dots, K\}$. The functions $f$ and $g$ have the form $f : \mathbb{K} \times \mathbb{R}^{n_{x}} \times \mathbb{R}^{n_{w}} \to \mathbb{R}^{n_{x}}$ and $g : \mathbb{K} \times \mathbb{R}^{n_{x}} \times \mathbb{R}^{n_{v}} \to \mathbb{R}^{n_{y}}$.

Different FD methods make different assumptions about the nature of the noises $w$ and $v$ and the required properties of the functions $f$ and $g$ . The assumptions for the

Conventional fault detection methods

This section describes the basic principles of the conventional FD methods compared in Sections 5–7.

Set-based fault detection

In this section, we present a generic algorithm for set-based FD that makes use of a set-based state estimator. We then describe five distinct methods that arise from applying this algorithm with different state estimators.

In set-based methods, the initial conditions, disturbances, and measurement noises are assumed to be bounded by known compact sets: $(x_{0}, w_{k}, v_{k}) \in C_{0} \times W \times V, \ \forall k \in \mathbb{K} .$ Let $y_{0 : K} = (y_{0}, \dots, y_{K})$ denote an observed output sequence. The goal of set-based state estimation is to characterize the sets

Fault detection in a CSTR

Consider the following model of a continuous stirred tank reactor (CSTR) from Shen and Scott (2017):
$x_{1, k + 1} = x_{1, k} + h [- u_{3, k} x_{1, k} x_{2, k} - k_{2} x_{1, k} x_{3, k} + \tau^{- 1} (u_{1, k} - 2 x_{1, k})],$
$x_{2, k + 1} = x_{2, k} + h [- u_{3, k} x_{1, k} x_{2, k} + \tau^{- 1} (u_{2, k} - 2 x_{2, k})],$
$x_{3, k + 1} = x_{3, k} + h [u_{3, k} x_{1, k} x_{2, k} - k_{2} x_{1, k} x_{3, k} - 2 \tau^{- 1} x_{3, k}],$
$x_{4, k + 1} = x_{4, k} + h [k_{2} x_{1, k} x_{3, k} - 2 \tau^{- 1} x_{4, k}],$
$y_{1, k} = x_{2, k} + v_{1, k},$
$y_{2, k} = x_{3, k} + v_{2, k},$
$y_{3, k} = x_{4, k} + v_{3, k} .$
Above, $x_{i}$ is the concentration (M) of species $i$, $y_{i}$ is the measurement of $x_{i}$, $u_{i}$ and $v_{i}$ are disturbances and measurement noises (specified further in each scenario
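For readers who want to experiment with this model, one step of the CSTR state update and output map above can be written directly in Python. This is a minimal sketch. The function names are hypothetical, and the numerical values of `h`, `k2`, and `tau` used in the example call are placeholders, not the values used in the case study.

```python
def cstr_step(x, u, h, k2, tau):
    """Advance the four CSTR states by one sampling interval."""
    x1, x2, x3, x4 = x
    u1, u2, u3 = u
    r1 = u3 * x1 * x2   # rate term u_3 * x_1 * x_2
    r2 = k2 * x1 * x3   # rate term k_2 * x_1 * x_3
    return (
        x1 + h * (-r1 - r2 + (u1 - 2.0 * x1) / tau),
        x2 + h * (-r1 + (u2 - 2.0 * x2) / tau),
        x3 + h * (r1 - r2 - 2.0 * x3 / tau),
        x4 + h * (r2 - 2.0 * x4 / tau),
    )

def cstr_output(x, v):
    """Output map: noisy measurements of x_2, x_3, and x_4."""
    return (x[1] + v[0], x[2] + v[1], x[3] + v[2])

# PLACEHOLDER values for illustration only (not taken from the paper).
x_next = cstr_step((1.0, 1.0, 0.0, 0.0), (1.0, 1.0, 1.0), h=0.1, k2=1.0, tau=1.0)
y_next = cstr_output(x_next, (0.0, 0.0, 0.0))
```

Keeping the rate terms `r1` and `r2` as named intermediates makes it easy to check each line against the corresponding equation.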

Fault detection in a batch reactor

The following dynamics describe a six-species enzymatic reaction in a batch reactor, where $x_{i}$ represents the concentration (M) of species $i$ (Scott and Barton, 2013):
$x_{1, k + 1} = x_{1, k} + h [- k_{1, k} x_{1, k} x_{2, k} + k_{2, k} x_{3, k} + k_{6, k} x_{6, k}],$
$x_{2, k + 1} = x_{2, k} + h [- k_{1, k} x_{1, k} x_{2, k} + k_{2, k} x_{3, k} + k_{3, k} x_{3, k}],$
$x_{3, k + 1} = x_{3, k} + h [k_{1, k} x_{1, k} x_{2, k} - k_{2, k} x_{3, k} - k_{3, k} x_{3, k}],$
$x_{4, k + 1} = x_{4, k} + h [k_{3, k} x_{3, k} - k_{4, k} x_{4, k} x_{5, k} + k_{5, k} x_{6, k}],$
$x_{5, k + 1} = x_{5, k} + h [- k_{4, k} x_{4, k} x_{5, k} + k_{5, k} x_{6, k} + k_{6, k} x_{6, k}],$
$x_{6, k + 1} = x_{6, k} + h [k_{4, k} x_{4, k} x_{5, k} - k_{5, k} x_{6, k} - k_{6, k} x_{6, k}] .$
The rate constants $k = (k_{1}, \dots, k_{6})$ are taken to be
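The update equations above translate directly into code, which also makes a structural property easy to check: the increments of $x_{2}$ and $x_{3}$ cancel, as do those of $x_{5}$ and $x_{6}$, so the sums $x_{2} + x_{3}$ and $x_{5} + x_{6}$ stay constant along any trajectory. The Python sketch below uses hypothetical function names and placeholder values for the rate constants, step size, and initial state. None of these numbers are taken from the case study.

```python
def batch_step(x, k, h):
    """Advance the six species concentrations by one sampling interval."""
    x1, x2, x3, x4, x5, x6 = x
    k1, k2, k3, k4, k5, k6 = k
    r1 = k1 * x1 * x2   # bilinear term k_1 * x_1 * x_2
    r4 = k4 * x4 * x5   # bilinear term k_4 * x_4 * x_5
    return (
        x1 + h * (-r1 + k2 * x3 + k6 * x6),
        x2 + h * (-r1 + k2 * x3 + k3 * x3),
        x3 + h * (r1 - k2 * x3 - k3 * x3),
        x4 + h * (k3 * x3 - r4 + k5 * x6),
        x5 + h * (-r4 + k5 * x6 + k6 * x6),
        x6 + h * (r4 - k5 * x6 - k6 * x6),
    )

def simulate_batch(x0, k, h, n_steps):
    """Return the state trajectory [x_0, x_1, ..., x_n] for constant k."""
    traj = [tuple(x0)]
    for _ in range(n_steps):
        traj.append(batch_step(traj[-1], k, h))
    return traj

# PLACEHOLDER values for illustration only (not taken from the paper).
rates = (0.1, 0.03, 0.05, 0.4, 0.2, 0.3)
trajectory = simulate_batch((1.0, 0.5, 0.0, 0.0, 0.5, 0.0), rates, h=0.1, n_steps=50)
```

Invariants of this kind hold for every admissible value of the rate constants, so they remain valid even when the constants are uncertain.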

Fault detection in a sewer system

The following dynamics describe a sewer system with three tanks (Tornil-Sin et al., 2012):
$x_{1, k + 1} = x_{1, k} + h [u_{1, k} + u_{2, k} - \kappa_{1} x_{1, k}],$
$x_{2, k + 1} = x_{2, k} + h [\kappa_{1} x_{1, k} - \kappa_{2} \sqrt{x_{2, k}}],$
$x_{3, k + 1} = x_{3, k} + h [\kappa_{2} \sqrt{x_{2, k}} + u_{3, k} - \kappa_{3} x_{3, k}],$
$y_{1, k} = x_{2, k} + v_{1, k},$
$y_{2, k} = x_{3, k} + v_{2, k} .$
Above, $x_{i}$ is the water volume in tank $i$ ($m^{3}$), $y_{i}$ is the measurement of $x_{i}$, $v_{i}$ is the measurement noise, $u_{i}$ is the inlet flowrate of rain ($m^{3} s^{- 1}$), $\kappa_{i}$ is the outlet valve constant, and $h = 30$ s. We assume that $u_{i} = d_{i} + w_{i}$, where $d = (1, 2, 1)$ specifies the nominal inflows and $w_{i}$ is a
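A single step of the three-tank model above can be sketched in Python as follows. The step size `H` and nominal inflows `D` are taken from the text. The valve constants `KAPPA`, the initial volumes, and the function names are hypothetical placeholders. The sketch assumes the volume in tank 2 is nonnegative, since `math.sqrt` raises an error otherwise.

```python
import math

H = 30.0                      # sampling interval in seconds (from the text)
D = (1.0, 2.0, 1.0)           # nominal inflows d (from the text)
KAPPA = (0.01, 0.01, 0.01)    # PLACEHOLDER valve constants, not from the paper

def sewer_step(x, u, kappa=KAPPA, h=H):
    """Advance the three tank volumes by one sampling interval."""
    x1, x2, x3 = x
    u1, u2, u3 = u
    kap1, kap2, kap3 = kappa
    q12 = kap1 * x1               # flow from tank 1 to tank 2
    q23 = kap2 * math.sqrt(x2)    # flow from tank 2 to tank 3 (needs x2 >= 0)
    return (
        x1 + h * (u1 + u2 - q12),
        x2 + h * (q12 - q23),
        x3 + h * (q23 + u3 - kap3 * x3),
    )

def sewer_output(x, v):
    """Output map: noisy measurements of the volumes in tanks 2 and 3."""
    return (x[1] + v[0], x[2] + v[1])

# One step from a hypothetical state with inflows at their nominal values.
x_next = sewer_step((100.0, 100.0, 100.0), D)
y_next = sewer_output(x_next, (0.0, 0.0))
```

The square-root outflow from tank 2 is the only nonlinearity in the model, so it is the term that set-based methods must bound carefully.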

Conclusions

In this paper, we introduced two new set-based fault detection (FD) algorithms based on the sDTDI and rDTDI set-based state estimators recently developed in Yang and Scott (2018a). We then studied the performance of these methods compared to both existing set-based FD methods and classical data-driven and observer-based FD methods using three detailed examples.

Among the set-based methods, rDTDI consistently provided the most accurate output bounds, followed by sDTDI. As a result, rDTDI also had

CRediT authorship contribution statement

Bowen Mu: Conceptualization, Methodology, Simulation, Writing – original draft. Xuejiao Yang: Conceptualization, Methodology, Simulation. Joseph K. Scott: Conceptualization, Writing – review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This material is based upon work supported by the National Science Foundation, USA under grant number 1949748.

References (44)

Alamo, T. et al. Guaranteed state estimation by zonotopes. Automatica (2005)
Choi, S.W. et al. Fault detection and identification of nonlinear processes based on kernel PCA. Chemom. Intell. Lab. Syst. (2005)
Combastel, C. An extended zonotopic and Gaussian Kalman filter (EZGKF) merging set-membership and stochastic paradigms: Toward non-linear filtering and fault detection. Annu. Rev. Control (2016)
Efimov, D. et al. Interval state observer for nonlinear time varying systems. Automatica (2013)
Isermann, R. Model-based fault-detection and diagnosis–status and applications. Annu. Rev. Control (2005)
Jauberthie, C. et al. Fault detection and identification relying on set-membership identifiability. Annu. Rev. Control (2013)
Ku, W. et al. Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. (1995)
Le, V.T.H. et al. Zonotopic guaranteed state estimation for uncertain systems. Automatica (2013)
Patton, R.J. et al. Observer-based fault detection and isolation: Robustness and applications. Control Eng. Pract. (1997)
Raïssi, T. et al. Interval observer design for consistency checks of nonlinear continuous-time systems. Automatica (2010)
Rego, B.S. et al. Set-valued state estimation of nonlinear discrete-time systems with nonlinear invariants based on constrained zonotopes. Automatica (2021)
Scott, J.K. et al. Bounds on the reachable sets of nonlinear control systems. Automatica (2013)
Scott, J.K. et al. Constrained zonotopes: A new tool for set-based estimation and fault detection. Automatica (2016)
Shen, K. et al. Rapid and accurate reachability analysis for nonlinear dynamic systems by exploiting model redundancy. Comput. Chem. Eng. (2017)
Tornil-Sin, S. et al. Robust fault detection of non-linear systems using set-membership state estimation based on constraint satisfaction. Eng. Appl. Artif. Intell. (2012)
Tulsyan, A. et al. Reachability-based fault detection method for uncertain chemical flow reactors. IFAC-PapersOnLine (2016)
Venkatasubramanian, V. et al. A review of process fault detection and diagnosis: Part III: Process history based methods. Comput. Chem. Eng. (2003)
Venkatasubramanian, V. et al. A review of process fault detection and diagnosis: Part I: Quantitative model-based methods. Comput. Chem. Eng. (2003)
Yang, X. et al. A comparison of zonotope order reduction techniques. Automatica (2018)
Blesa, J. et al. Robust fault detection using polytope-based set-membership consistency test. IET Control Theory Appl. (2012)
Bravo, J.M. et al. Bounded error identification of systems with time-varying parameters. IEEE Trans. Automat. Control (2006)
Chai, W. et al. Robust fault detection using set membership estimation and TS fuzzy neural network
