Comparison of advanced set-based fault detection methods with classical data-driven and observer-based methods for uncertain nonlinear processes

https://doi.org/10.1016/j.compchemeng.2022.107975

Highlights

  • Presents a new set-based fault detection method with improved fault sensitivity.

  • Provides a detailed comparison of set-based methods with classical methods.

  • Classical methods detect faults faster but suffer from frequent false alarms.

  • Set-based fault detection methods can guarantee a zero false alarm rate.

  • Set-based methods are more robust to disturbances, model uncertainty, and transients.

Abstract

Automated fault detection (FD) methods are essential for safe and profitable operation of complex engineered systems. Both data-driven and model-based methods have been extensively studied, and some are widely used in practice. However, distinguishing faults from acceptable process variations remains a critical challenge, making both false alarms and missed faults commonplace. In principle, set-based FD methods can rigorously address this challenge. However, existing methods are often much too conservative, particularly for nonlinear systems. Moreover, few if any published comparisons clearly demonstrate the supposed advantages of set-based methods relative to conventional methods. This paper first presents a new set-based FD method based on discrete-time differential inequalities and demonstrates increased fault sensitivity through several case studies. Next, a detailed comparison of set-based methods with representative data-driven and model-based approaches is presented. The results verify some key advantages of the set-based approaches, but also highlight key challenges for future work.

Introduction

Due to the level of complexity, integration, and automation in modern engineered systems, equipment malfunctions and other abnormal events are frequent and unavoidable. These events, termed faults, often have serious economic, safety, and environmental consequences if not detected quickly (Venkatasubramanian et al., 2003b). At the same time, false alarms (i.e., fault declarations during normal operation) caused by benign disturbances can lead to shut-downs or other operational changes that also do significant economic harm. Thus, automated methods for detecting faults quickly and accurately are essential. This paper introduces a new set-based fault detection algorithm based on discrete-time differential inequalities and presents a detailed comparison of set-based fault detection methods with conventional data-driven and model-based approaches.

To date, the theory and practice of fault detection (FD) has been dominated by data-driven approaches (Chiang et al., 2000, Venkatasubramanian et al., 2003b). In these methods, historical data is analyzed to identify important statistics, and the variability of these statistics under normal operating conditions is quantified in terms of thresholds. Online, new measurements are compared with the historical data and a fault is declared if the current statistics violate the computed thresholds. In ideal cases (e.g., stationary Gaussian data), well-established statistical methods such as principal component analysis (PCA) can be used to identify appropriate statistics and set thresholds to achieve any desired rate of false alarms (Chiang et al., 2000). These methods are simple, scalable, and widely used in industry (Joe Qin, 2003). However, for general non-Gaussian data, these simple methods can result in inappropriate statistics or thresholds, often leading to high false alarm rates. Several advanced data-driven methods based on independent component analysis (ICA), dynamic PCA, kernel PCA, and other machine learning techniques have been developed to address this, but these have considerably higher costs and are not well established in practice (Lee et al., 2006, Ku et al., 1995, Choi et al., 2005, Lee et al., 2007). Another significant disadvantage of all data-driven methods is that they require historical data that is appropriate for the current operating point. If the current operating point is different from that of the historical data, either intentionally or due to a persistent disturbance or transient, the process statistics can deviate significantly from historical values, leading to persistent false alarms that render the method unusable (see experiments in Section 5).
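The data-driven workflow described above (fit statistics on fault-free history, set a threshold, then test new samples online) can be sketched as follows. This is a minimal illustration, not the specific PCA method used in the paper's comparisons: it uses Hotelling's T² statistic on the retained principal components, and the empirical-quantile threshold, the synthetic data, and the function names `fit_pca_monitor` and `t2_statistic` are all assumptions made for this example.

```python
import numpy as np

def fit_pca_monitor(X_train, n_components, quantile=0.99):
    """Fit PCA on fault-free data and set an empirical T^2 threshold."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std                    # standardized training data
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                       # loading vectors (columns)
    lam = s[:n_components] ** 2 / (len(Z) - 1)    # variances of retained scores
    t2_train = np.sum((Z @ P) ** 2 / lam, axis=1)
    # Assumption: threshold taken as an empirical quantile of the training T^2
    threshold = float(np.quantile(t2_train, quantile))
    return mean, std, P, lam, threshold

def t2_statistic(x, mean, std, P, lam):
    """Hotelling's T^2 of one new sample in the retained PCA subspace."""
    t = ((x - mean) / std) @ P
    return float(np.sum(t ** 2 / lam))

# Synthetic fault-free history: 3 measured variables driven by 2 latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = np.array([[1.0, 0.5, 0.2], [0.0, 1.0, 0.8]])
X_train = latent @ mixing + 0.05 * rng.normal(size=(500, 3))

mean, std, P, lam, threshold = fit_pca_monitor(X_train, n_components=2)

x_normal = mean.copy()                                         # nominal point
x_fault = mean + std * (10.0 * np.sqrt(lam[0]) * P[:, 0])      # 10-sigma shift
t2_normal = t2_statistic(x_normal, mean, std, P, lam)
t2_fault = t2_statistic(x_fault, mean, std, P, lam)
print("alarm (normal):", t2_normal > threshold)
print("alarm (fault): ", t2_fault > threshold)
```

Because `mean`, `std`, and `threshold` are fixed from the historical data, a shift in the operating point moves every new sample away from `mean` and can trigger the persistent false alarms discussed above.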

To address these limitations, another class of FD methods makes use of process models in place of historical data (Venkatasubramanian et al., 2003b, Isermann, 2005, Patton and Chen, 1997). The most standard approach is to use an observer (i.e., state estimator) to predict the most likely output values at the next sampling time. A fault is then declared if the measured outputs deviate from the predictions by more than a prescribed threshold (Patton and Chen, 1997). This eliminates the need for historical data at the current operating point and naturally handles non-steady operations. In principle, the model also provides a means to completely characterize the output statistics under fault-free conditions, which is crucial for setting thresholds that accurately distinguish faults from disturbances. However, this requires both an accurate model and accurate knowledge of the disturbance probability distributions, which is often impractical. Moreover, even when these distributions are available, propagating them through a nonlinear model to obtain output distributions is extremely difficult. Thus, significant approximations are needed, such as the use of linearization and Gaussian distributions in methods based on extended Kalman filters (Patton and Chen, 1997). In general, this can lead to inaccurate output statistics and, as a result, fault insensitivity or excessive false alarms (see experiments in Section 5).
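The observer-based test described above can be illustrated with one extended Kalman filter step that returns a normalized residual for thresholding. This is a hedged sketch rather than the EKF method used in the paper's comparisons: the scalar model, the covariances `Q` and `R`, the 3-sigma threshold, and the function name `ekf_step` are assumptions chosen for illustration.

```python
import numpy as np

def ekf_step(x_hat, P, y, f, F_jac, h, H_jac, Q, R):
    """One EKF predict/update step; also returns the squared normalized residual."""
    # Prediction through the nonlinear model, covariance via linearization
    x_pred = f(x_hat)
    F = F_jac(x_hat)
    P_pred = F @ P @ F.T + Q
    # Residual between measured and predicted output, with approximate covariance
    H = H_jac(x_pred)
    r = y - h(x_pred)
    S = H @ P_pred @ H.T + R
    d2 = float(r @ np.linalg.solve(S, r))
    # Measurement update
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ r
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new, d2

# Hypothetical scalar model: x_{k+1} = 0.9 x_k + 0.1 sin(x_k) + w_k, y_k = x_k + v_k
f = lambda x: 0.9 * x + 0.1 * np.sin(x)
F_jac = lambda x: np.array([[0.9 + 0.1 * np.cos(x[0])]])
h = lambda x: x
H_jac = lambda x: np.eye(1)
Q, R = 1e-4 * np.eye(1), 1e-2 * np.eye(1)

x_hat, P0 = np.array([1.0]), 0.01 * np.eye(1)
threshold = 9.0  # assumption: 3-sigma bound on a scalar Gaussian residual

y_ok = f(x_hat)            # measurement equal to the prediction
y_bad = f(x_hat) + 1.0     # measurement with a large offset
_, _, d2_ok = ekf_step(x_hat, P0, y_ok, f, F_jac, h, H_jac, Q, R)
_, _, d2_bad = ekf_step(x_hat, P0, y_bad, f, F_jac, h, H_jac, Q, R)
print("alarm (consistent measurement):", d2_ok > threshold)
print("alarm (offset measurement):   ", d2_bad > threshold)
```

The residual covariance `S` is only as good as the linearization and the assumed Gaussian `Q` and `R`, which is the source of the threshold inaccuracy noted above.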

A third class of FD methods called set-based methods attempts to address the shortcomings of observer-based methods by modeling all disturbances and measurement noises in terms of deterministic bounds rather than probability distributions. Specifically, these inputs are assumed to be bounded within known compact sets, but nothing is assumed about their distributions. Set-based computations are then used to rigorously test if a new measured output is consistent with the process model given these bounds, and a fault is declared if not. This approach is attractive because obtaining bounds on disturbances and measurement noises is often easier than obtaining accurate probability distributions. Moreover, model uncertainty can be easily incorporated using bounded time-invariant parameters, which lessens the need for a highly accurate model. Finally, these methods completely eliminate false alarms provided that the input bounds are valid. However, accurate set-based computations are required to achieve high fault sensitivity, which remains a major challenge.
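To make the consistency test concrete, the following sketch checks a scalar measurement against an interval enclosure of the model output. It is an illustration only: the model y = θx + v, the bounds, and the helper names are assumptions, and plain floating-point arithmetic is used without the outward rounding that a rigorous implementation would need.

```python
import numpy as np

def interval_mul(a, b):
    """Product of two intervals given as (lower, upper) tuples."""
    corners = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(corners), max(corners))

def output_enclosure(theta, x, v):
    """Interval enclosure of y = theta * x + v over the given bounds."""
    lo, hi = interval_mul(theta, x)
    return (lo + v[0], hi + v[1])

def is_faulty(y, enclosure):
    """Declare a fault only if y is inconsistent with the model and bounds."""
    return not (enclosure[0] <= y <= enclosure[1])

# Hypothetical bounds: uncertain parameter, state enclosure, measurement noise
theta, x, v = (0.8, 1.2), (1.0, 2.0), (-0.1, 0.1)
enc = output_enclosure(theta, x, v)

# Fault-free samples with arbitrary (here uniform) distributions inside the bounds
rng = np.random.default_rng(0)
samples = (rng.uniform(*theta, 10_000) * rng.uniform(*x, 10_000)
           + rng.uniform(*v, 10_000))
false_alarms = sum(is_faulty(y, enc) for y in samples)
print("enclosure:", enc, "false alarms:", false_alarms)
print("fault declared for y = 3.0:", is_faulty(3.0, enc))
```

No fault-free sample can leave the enclosure as long as the bounds hold, whatever the distribution inside them. The price is that a faulty measurement that still lands inside the enclosure goes undetected, which is why tight enclosures matter for fault sensitivity.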

Many set-based FD methods are available for linear systems using computations with intervals (Seron and De Doná, 2009, Efimov et al., 2013), polytopes (Blesa et al., 2012), ellipsoids (Reppa and Tzes, 2011), zonotopes (Scott et al., 2013, Ingimundarson et al., 2009, Tabatabaeipour et al., 2012), and constrained zonotopes (Scott et al., 2016). However, testing the consistency of a measured output with a nonlinear model is significantly more difficult. One approach is to use set-based parameter estimation, wherein measurements are used to compute an enclosure of the set of consistent model parameters. A fault is then declared when this enclosure has no overlap with a known set of possible parameter values. In Jauberthie et al. (2013), this is done using a differential algebraic approach and interval-based set inversion techniques. However, the computational cost scales exponentially with the number of uncertain parameters. This method is extended to systems with probabilistic noises using a Bayesian framework in Fernández-Cantí et al. (2013), but this does not provide rigorous bounds.

Another approach to set-based FD is to apply set-based state estimation. At each sampling time, a set-based state estimator provides a guaranteed enclosure of the set of states consistent with the model, the bounded uncertainties, and all past measurements. This is then used to compute an enclosure of the possible model outputs, and a fault is declared if the measured output is outside of this set. The key challenge is to compute sufficiently accurate enclosures fast enough for online fault detection. A method for continuous-time systems based on upper and lower Luenberger observers and cooperativity theory is proposed in Raïssi et al. (2010). In Combastel (2016), an Extended Zonotopic and Gaussian Kalman Filter is proposed for discrete-time systems with uncertainties composed of bounded and unbounded parts. However, both methods rely on conservative linearizations of the dynamics over the entire state domain, which can lead to weak enclosures compared to adaptive linearizations such as those in Alamo et al. (2005) and Combastel (2005). The FD method in Rostampour et al. (2017) also uses a set-based state estimator, but forgoes rigorous enclosures in favor of smaller sets based on a prescribed false alarm rate. Thus, this method is not guaranteed to avoid false alarms. Moreover, it requires the solution of nonlinear chance-constrained optimization problems at each sampling time, which is computationally prohibitive. To reduce conservatism and increase efficiency, some approaches use approximate models with simpler structure. In Wang and Puig (2016), nonlinear models are linearized before constructing the observer, as in the extended Kalman filter. Similarly, Chai et al. (2013) approximate nonlinear input–output models using a Takagi–Sugeno fuzzy neural network that is linear in the uncertain parameters. Ellipsoidal (Chai et al., 2013) and zonotopic (Wang and Puig, 2016) enclosures are then computed for the approximate models and used for fault detection. However, these enclosures are not necessarily valid for the original system. Finally, Tulsyan and Barton (2016) proposes a set-based FD method for continuous-time systems using advanced reachable set bounding techniques based on differential inequalities. However, measurements are not used to refine the predicted enclosures as in a true set-based state estimator, which is a serious limitation.
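The estimator-based scheme described above (test the measurement against an output enclosure, refine the state enclosure with the measurement, then predict forward) can be sketched with plain interval arithmetic. This is a minimal illustration and not any of the cited methods: the scalar linear model, the bounds, the sensor-bias fault, and the function name `first_alarm` are assumptions, and floating-point rounding is ignored.

```python
import numpy as np

# Hypothetical model: x_{k+1} = A x_k + w_k, y_k = x_k + v_k, with A > 0
A, W, V = 0.9, (-0.05, 0.05), (-0.1, 0.1)

def first_alarm(y_seq, x_bounds):
    """Return the first time index at which a fault is declared, or None."""
    lo, hi = x_bounds
    for k, y in enumerate(y_seq):
        # Consistency test against the output enclosure [lo + V_lo, hi + V_hi]
        if not (lo + V[0] <= y <= hi + V[1]):
            return k
        # Correction: intersect with the states consistent with this measurement
        lo, hi = max(lo, y - V[1]), min(hi, y - V[0])
        # Prediction: propagate the enclosure through the model (valid for A > 0)
        lo, hi = A * lo + W[0], A * hi + W[1]
    return None

# Simulate fault-free data with noises drawn inside the assumed bounds
rng = np.random.default_rng(1)
x, ys = 1.0, []
for _ in range(60):
    ys.append(x + rng.uniform(*V))
    x = A * x + rng.uniform(*W)
ys = np.array(ys)

# Hypothetical fault: additive sensor bias from time step 30 onward
ys_fault = ys.copy()
ys_fault[30:] += 0.6

alarm_free = first_alarm(ys, (0.5, 1.5))
alarm_fault = first_alarm(ys_fault, (0.5, 1.5))
print("fault-free run:", alarm_free)
print("biased-sensor run:", alarm_fault)
```

The correction step is what distinguishes a set-based state estimator from pure reachability analysis: without the intersection, the enclosure would only widen and a bias of this size could go undetected for longer.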

Despite this prior work, performing set-based computations with sufficient accuracy for effective fault detection remains a major challenge for nonlinear systems. Moreover, although the potential advantages of set-based FD methods have been articulated in many prior studies, to the best of our knowledge, no detailed studies comparing set-based FD methods to more conventional data-driven and observer-based methods are available in the literature. In this context, this article makes two main contributions. First, we present a new set-based FD algorithm based on the set-based state estimator recently developed in Yang and Scott (2018a). This estimator computes interval enclosures using the theory of discrete-time differential inequalities (DTDI) and has been shown to produce significantly more accurate enclosures than other state-of-the-art set-based state estimators for nonlinear test cases in Yang and Scott (2018a). However, this method has not previously been applied for FD. To this end, we develop a new FD algorithm using DTDI and demonstrate through case studies that it offers significantly improved fault sensitivity.

Second, we present a detailed comparison of set-based FD methods with more conventional data-driven and observer-based methods using three case studies. Specifically, we compare against the standard PCA method described in Joe Qin (2003) and the extended Kalman filter (EKF) method from Fathi et al. (1993). These methods were chosen because they are representative of the state-of-practice in data-driven and observer-based FD, respectively. We acknowledge that many alternative methods exist and may perform better in specific scenarios. Yet, none are as well established or widely used, and so it seems most informative to first understand how set-based methods compare to these classical benchmarks.

Comparing data-driven, observer-based, and set-based FD methods is challenging and somewhat ill-posed because they are based on fundamentally different assumptions about the process noises and require different information about the process that may not be accurately known in practice. Thus, any comparison necessarily involves applying the methods in cases that violate some of their assumptions. Yet, this is exactly the case when applying these methods to real systems, so it is important to study how they perform in such circumstances. To this end, we develop a framework for comparing data-driven, observer-based, and set-based FD methods using simulated systems with various noise distributions, various types of model uncertainty, various fault-free scenarios representing different operating conditions, and various types of faults. In every case, we make the most sensible approximation of how each method would be applied in practice without knowledge of the actual process and noise distributions used in the simulation.

The remainder of this paper is organized as follows. Section 2 gives a formal problem statement. Section 3 describes the representative data-driven and observer-based FD methods selected for our comparisons. Section 4 introduces the set-based FD paradigm, describes existing methods we compare against, and presents our new set-based FD algorithm. In Sections 5, 6, and 7, detailed comparisons of all FD methods are provided using three case studies. Finally, Section 8 provides further discussion and concluding remarks.

Section snippets

Problem statement

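This paper considers FD algorithms for uncertain nonlinear discrete-time systems of the form $x_{k+1} = f(k, x_k, w_k)$, $y_k = g(k, x_k, v_k)$. Above, $x_k \in \mathbb{R}^{n_x}$ is the state, $y_k \in \mathbb{R}^{n_y}$ is the output, $w_k \in \mathbb{R}^{n_w}$ is the disturbance, $v_k \in \mathbb{R}^{n_v}$ is the measurement noise, and $k \in \mathbb{K} \equiv \{0, \ldots, K\}$. The functions $f$ and $g$ have the form $f: \mathbb{K} \times \mathbb{R}^{n_x} \times \mathbb{R}^{n_w} \to \mathbb{R}^{n_x}$ and $g: \mathbb{K} \times \mathbb{R}^{n_x} \times \mathbb{R}^{n_v} \to \mathbb{R}^{n_y}$.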

Different FD methods make different assumptions about the nature of the noises w and v and the required properties of the functions f and g. The assumptions for the

Conventional fault detection methods

This section describes the basic principles of the conventional FD methods compared in Sections 5–7.

Set-based fault detection

In this section, we present a generic algorithm for set-based FD that makes use of a set-based state estimator. We then describe five distinct methods that arise from applying this algorithm with different state estimators.

In set-based methods, the initial conditions, disturbances, and measurement noises are assumed to be bounded by known compact sets: $(x_0, w_k, v_k) \in C_0 \times W \times V$, $\forall k \in \mathbb{K}$. Let $y_{0:K} = (y_0, \ldots, y_K)$ denote an observed output sequence. The goal of set-based state estimation is to characterize the sets

Fault detection in a CSTR

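Consider the following model of a continuous stirred tank reactor (CSTR) from (Shen and Scott, 2017):

$$
\begin{aligned}
x_{1,k+1} &= x_{1,k} + h\left[-u_{3,k} x_{1,k} x_{2,k} - k_2 x_{1,k} x_{3,k} + \tau^{-1}(u_{1,k} - 2 x_{1,k})\right], \\
x_{2,k+1} &= x_{2,k} + h\left[-u_{3,k} x_{1,k} x_{2,k} + \tau^{-1}(u_{2,k} - 2 x_{2,k})\right], \\
x_{3,k+1} &= x_{3,k} + h\left[u_{3,k} x_{1,k} x_{2,k} - k_2 x_{1,k} x_{3,k} - 2 \tau^{-1} x_{3,k}\right], \\
x_{4,k+1} &= x_{4,k} + h\left[k_2 x_{1,k} x_{3,k} - 2 \tau^{-1} x_{4,k}\right], \\
y_{1,k} &= x_{2,k} + v_{1,k}, \\
y_{2,k} &= x_{3,k} + v_{2,k}, \\
y_{3,k} &= x_{4,k} + v_{3,k}.
\end{aligned}
$$

Above, $x_i$ is the concentration (M) of species $i$, $y_i$ is the measurement of $x_i$, $u_i$ and $v_i$ are disturbances and measurement noises (specified further in each scenario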

Fault detection in a batch reactor

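The following dynamics describe a six-species enzymatic reaction in a batch reactor, where $x_i$ represents the concentration (M) of species $i$ (Scott and Barton, 2013):

$$
\begin{aligned}
x_{1,k+1} &= x_{1,k} + h\left[-k_{1,k} x_{1,k} x_{2,k} + k_{2,k} x_{3,k} + k_{6,k} x_{6,k}\right], \\
x_{2,k+1} &= x_{2,k} + h\left[-k_{1,k} x_{1,k} x_{2,k} + k_{2,k} x_{3,k} + k_{3,k} x_{3,k}\right], \\
x_{3,k+1} &= x_{3,k} + h\left[k_{1,k} x_{1,k} x_{2,k} - k_{2,k} x_{3,k} - k_{3,k} x_{3,k}\right], \\
x_{4,k+1} &= x_{4,k} + h\left[k_{3,k} x_{3,k} - k_{4,k} x_{4,k} x_{5,k} + k_{5,k} x_{6,k}\right], \\
x_{5,k+1} &= x_{5,k} + h\left[-k_{4,k} x_{4,k} x_{5,k} + k_{5,k} x_{6,k} + k_{6,k} x_{6,k}\right], \\
x_{6,k+1} &= x_{6,k} + h\left[k_{4,k} x_{4,k} x_{5,k} - k_{5,k} x_{6,k} - k_{6,k} x_{6,k}\right].
\end{aligned}
$$

The rate constants $k = (k_1, \ldots, k_6)$ are taken to be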

Fault detection in a sewer system

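The following dynamics describe a sewer system with three tanks (Tornil-Sin et al., 2012):

$$
\begin{aligned}
x_{1,k+1} &= x_{1,k} + h\left[u_{1,k} + u_{2,k} - \kappa_1 x_{1,k}\right], \\
x_{2,k+1} &= x_{2,k} + h\left[\kappa_1 x_{1,k} - \kappa_2 x_{2,k}\right], \\
x_{3,k+1} &= x_{3,k} + h\left[\kappa_2 x_{2,k} + u_{3,k} - \kappa_3 x_{3,k}\right], \\
y_{1,k} &= x_{2,k} + v_{1,k}, \\
y_{2,k} &= x_{3,k} + v_{2,k}.
\end{aligned}
$$

Above, $x_i$ is the water volume in tank $i$ (m$^3$), $y_i$ is the measurement of $x_i$, $v_i$ is the measurement noise, $u_i$ is the inlet flowrate of rain (m$^3$ s$^{-1}$), $\kappa_i$ is the outlet valve constant, and $h = 30$ s. We assume that $u_i = d_i + w_i$, where $d = (1, 2, 1)$ specifies the nominal inflows and $w_i$ is a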

Conclusions

In this paper, we introduced two new set-based fault detection (FD) algorithms based on the sDTDI and rDTDI set-based state estimators recently developed in Yang and Scott (2018a). We then studied the performance of these methods compared to both existing set-based FD methods and classical data-driven and observer-based FD methods using three detailed examples.

Among the set-based methods, rDTDI consistently provided the most accurate output bounds, followed by sDTDI. As a result, rDTDI also had

CRediT authorship contribution statement

Bowen Mu: Conceptualization, Methodology, Simulation, Writing – original draft. Xuejiao Yang: Conceptualization, Methodology, Simulation. Joseph K. Scott: Conceptualization, Writing – review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This material is based upon work supported by the National Science Foundation, USA under grant number 1949748.

References (44)

  • Rego, B.S., et al. Set-valued state estimation of nonlinear discrete-time systems with nonlinear invariants based on constrained zonotopes. Automatica (2021)
  • Scott, J.K., et al. Bounds on the reachable sets of nonlinear control systems. Automatica (2013)
  • Scott, J.K., et al. Constrained zonotopes: A new tool for set-based estimation and fault detection. Automatica (2016)
  • Shen, K., et al. Rapid and accurate reachability analysis for nonlinear dynamic systems by exploiting model redundancy. Comput. Chem. Eng. (2017)
  • Tornil-Sin, S., et al. Robust fault detection of non-linear systems using set-membership state estimation based on constraint satisfaction. Eng. Appl. Artif. Intell. (2012)
  • Tulsyan, A., et al. Reachability-based fault detection method for uncertain chemical flow reactors. IFAC-PapersOnLine (2016)
  • Venkatasubramanian, V., et al. A review of process fault detection and diagnosis: Part III: Process history based methods. Comput. Chem. Eng. (2003)
  • Venkatasubramanian, V., et al. A review of process fault detection and diagnosis: Part I: Quantitative model-based methods. Comput. Chem. Eng. (2003)
  • Yang, X., et al. A comparison of zonotope order reduction techniques. Automatica (2018)
  • Blesa, J., et al. Robust fault detection using polytope-based set-membership consistency test. IET Control Theory Appl. (2012)
  • Bravo, J.M., et al. Bounded error identification of systems with time-varying parameters. IEEE Trans. Automat. Control (2006)
  • Chai, W., et al. Robust fault detection using set membership estimation and TS fuzzy neural network
