Identification of Wiener–Hammerstein models based on variational bayesian approach in the presence of process noise

doi:10.1016/j.jfranklin.2021.05.003

Journal of the Franklin Institute

Volume 358, Issue 10, July 2021, Pages 5623-5638

https://doi.org/10.1016/j.jfranklin.2021.05.003

Abstract

This paper addresses the identification of Wiener–Hammerstein (WH) models in the presence of both process and measurement noise, a problem that has not been well studied in existing work. To obtain unbiased estimates, the model parameters are found by maximizing the likelihood function within the expectation-maximization framework. Because the posterior distributions of the latent variables of WH models are difficult to compute, variational Bayes (VB) is used, and a method based on Monte Carlo integration is proposed for approximating the posterior distributions within the VB framework. To the best of our knowledge, this is the first use of the VB approach for WH model identification. Two simulation examples demonstrate the effectiveness of the proposed method. The method is also applied to a WH benchmark problem, and the results show that it improves identification performance.

Introduction

Nonlinear system identification is widely recognized as a difficult and challenging problem and has attracted much attention over the last decades [1], [2], [3]. No single identification method works for all nonlinear models. As a result, identification methods have been proposed for specific nonlinear structures, such as nonlinear state space models [4], NARMAX (Nonlinear Autoregressive Moving Average with eXogenous inputs) models [5] and block-oriented nonlinear models [6], [7], [8], [9]. Among these, block-oriented models are recognized as useful because they are simple and easy to interpret. The representative block-oriented structures are the Wiener and Hammerstein structures. In this paper, we consider the identification of the Wiener–Hammerstein (WH) model, a relatively complex member of the block-oriented family.

While many papers have considered the identification of WH models [10], [11], [12], [13], [14], [15], some problems still need further study. The structure of the WH system is shown in Fig. 1, where $x_{t}$ is the intermediate, unmeasurable variable and $y_{t}$ is the output variable. Most existing works consider identification only for the case in which $y_{t}$ is disturbed by measurement noise $e_{t}$. In practice, the intermediate variable $x_{t}$ can also be disturbed by noise $w_{t}$ (called process noise). Few works have considered this case, which was addressed by Ljung and co-authors in Hagenblad et al. [11]. If both process and measurement noise are present, some existing methods fail to estimate the model parameters. Among the identification methods for WH models, the one based on prediction error minimization (PEM) is the most widely used [10], [16], [17]. However, because of the nonlinearity and the presence of process noise, the estimates of some parameters will be biased, as shown in Hagenblad et al. [11]. An alternative is maximum likelihood estimation (MLE), which, unlike PEM, can provide unbiased estimates.

MLE is a widely used method for system identification. It was first used for Wiener system identification in Hagenblad et al. [11], where the likelihood function was computed directly and the parameters were obtained by maximizing it. Computing the likelihood function directly requires evaluating many exponential functions and integrals, and it is sometimes infeasible. As an alternative, the Expectation-Maximization (EM) algorithm is an effective way to maximize the likelihood function and has been widely used for system identification [18], [19], [20], [21]. In [13], the authors used the EM algorithm to identify Wiener models in the absence of process noise. In the EM framework, the posterior probabilities of the latent variables must be evaluated. For a WH model in the presence of process noise, however, it is difficult to obtain the posterior distributions of the latent variables $x_{t}$. In such cases, variational Bayes [22], [23] provides a way to compute the posterior distributions of the latent variables and has been widely used to extend the applicability of the EM algorithm (the result is called variational Bayesian EM, VBEM). Because of these advantages, VBEM has recently been used for system identification [24], [25], [26]. However, most of these works focus on linear model identification. The application of VBEM to nonlinear system identification, especially to WH models, has not been well studied.

In this paper, we consider the identification of WH models in the presence of process and measurement noise. The model parameters are estimated by MLE, and the VBEM algorithm is used to obtain the maximum likelihood estimate. We propose an approximate method based on Monte Carlo integration to optimize the posterior distributions of the intermediate variables, whose distributional form is unknown. Two numerical examples show the effectiveness and merits of the proposed method. The proposed method is also applied to a WH benchmark problem, and the results show that it outperforms existing methods.

This paper is organized as follows. Section 2 describes the identification problem considered in this paper. Section 3 describes the VBEM framework and details the proposed VBEM algorithm for WH model identification. Simulation studies and a benchmark WH identification are given in Sections 4 and 5, respectively. Section 6 concludes the paper.

Section snippets

Problem statement

The WH model considered in this paper is shown in Fig. 1 and can be described as follows: $\begin{matrix} x_{t}^{o} = G (q) u_{t} \\ x_{t} = x_{t}^{o} + w_{t} \\ y_{t} = \sum_{i = 0}^{M} \lambda_{i} f (x_{t - i}) + e_{t} \end{matrix}$ where $u_{t}$ and $y_{t}$ are the input and output signals, $x_{t}$ is the intermediate variable disturbed by the process noise $w_{t}$, and $y_{t}$ is disturbed by the measurement noise $e_{t}$. $G (q)$ is the input transfer function, a causal and stable subsystem described by $G (q) = \frac{b_{0} + b_{1} q^{- 1} + \dots + b_{n_{b}} q^{- n_{b}}}{1 + a_{1} q^{- 1} + \dots + a_{n_{a}} q^{- n_{a}}} .$ For simplicity, we use the impulse response $\{\lambda_{0}, \lambda_{1}, \dots, \lambda_{M}\}$ to represent
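The three model equations above can be sketched as a small simulator. This is a minimal illustration, not the authors' code: the function names `iir_filter` and `simulate_wh` are hypothetical, the noises are assumed to be zero-mean Gaussian with variances `q_var` and `r_var`, and all signals are assumed to be zero before $t = 0$.

```python
import random


def iir_filter(b, a, u):
    """Apply G(q) = (b0 + b1 q^-1 + ...) / (1 + a1 q^-1 + ...) to the sequence u.

    `a` holds [a1, ..., a_na]; the leading 1 of the denominator is implicit.
    Zero initial conditions are assumed.
    """
    x = []
    for t in range(len(u)):
        acc = sum(b[i] * u[t - i] for i in range(len(b)) if t - i >= 0)
        acc -= sum(a[j - 1] * x[t - j] for j in range(1, len(a) + 1) if t - j >= 0)
        x.append(acc)
    return x


def simulate_wh(u, b, a, lam, f, q_var, r_var, rng):
    """Simulate x_t^o = G(q) u_t, x_t = x_t^o + w_t, y_t = sum_i lam_i f(x_{t-i}) + e_t."""
    x_o = iir_filter(b, a, u)
    # Process noise enters before the static nonlinearity f.
    x = [v + rng.gauss(0.0, q_var ** 0.5) for v in x_o]
    fx = [f(v) for v in x]
    y = []
    for t in range(len(u)):
        s = sum(lam[i] * fx[t - i] for i in range(len(lam)) if t - i >= 0)
        # Measurement noise is added at the output.
        y.append(s + rng.gauss(0.0, r_var ** 0.5))
    return x_o, x, y


# Example call with illustrative (made-up) parameter values.
rng = random.Random(1)
u = [rng.uniform(-1.0, 1.0) for _ in range(200)]
x_o, x, y = simulate_wh(u, b=[0.0, 1.0], a=[-0.5], lam=[1.0, 0.3],
                        f=lambda v: v + 0.5 * v * v,
                        q_var=0.1, r_var=0.01, rng=rng)
```

Because the process noise passes through $f$ and the output filter, its effect on $y_{t}$ is not additive, which is what separates this setting from the measurement-noise-only case.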

VB inference

In the VB framework, the likelihood function is maximized iteratively. Without loss of generality, at iteration $k$, the log-likelihood of the observed data can be decomposed as in Bishop [27], $\begin{aligned} \ln p (y_{1 : N} | \Theta^{k}) = {} & \int q^{k} (x_{1 : N}) \ln \frac{p (y_{1 : N}, x_{1 : N} | \Theta^{k})}{q^{k} (x_{1 : N})} \, d x_{1 : N} \\ & + \int q^{k} (x_{1 : N}) \ln \frac{q^{k} (x_{1 : N})}{p (x_{1 : N} | y_{1 : N}, \Theta^{k})} \, d x_{1 : N} \end{aligned}$ where $x_{1 : N} = \{x_{1}, x_{2}, \dots, x_{N}\}$ denotes the latent variables, and $q^{k} (x_{1 : N})$ is the (approximate) joint posterior distribution of the latent variables $x_{1 : N}$. The derivation of Eq. (4) is given in Appendix A. Let $L (q^{k}, \Theta^{k}) =$
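The decomposition in Eq. (4) can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: it replaces the integrals with sums over a single discrete latent variable with three states, and the probability values and the names `first_term` and `second_term` are made up for the example. The second term is the Kullback–Leibler divergence from $q$ to the true posterior, which is non-negative, so the first term is a lower bound on the log-likelihood.

```python
import math

# Toy model: one latent variable x with three states and one fixed observation y.
prior = [0.5, 0.3, 0.2]        # p(x | Theta), illustrative values
likelihood = [0.1, 0.6, 0.3]   # p(y | x, Theta) at the observed y, illustrative values

joint = [p * l for p, l in zip(prior, likelihood)]   # p(y, x | Theta)
evidence = sum(joint)                                # p(y | Theta)
posterior = [j / evidence for j in joint]            # p(x | y, Theta)


def first_term(q):
    """Sum over x of q(x) ln( p(y, x | Theta) / q(x) )."""
    return sum(qi * math.log(ji / qi) for qi, ji in zip(q, joint) if qi > 0)


def second_term(q, post):
    """Sum over x of q(x) ln( q(x) / p(x | y, Theta) ), i.e. KL(q || posterior)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, post) if qi > 0)


q = [0.2, 0.5, 0.3]  # an arbitrary distribution over the latent states
gap = first_term(q) + second_term(q, posterior) - math.log(evidence)
print("decomposition residual:", gap)
print("KL term for this q:", second_term(q, posterior))
```

Setting `q` equal to `posterior` drives the second term to zero, in which case the first term equals the log-likelihood. This is why each iteration tries to bring $q^{k}$ close to the true posterior.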

Simulation studies

In this section, we first use a numerical example to demonstrate the effectiveness of the proposed algorithm based on the VB approach. The WH model is described as $\begin{matrix} x_{t}^{o} = a_{0} x (t - 1) + u (t - 1) \\ x_{t} = x_{t}^{o} + w_{t} \\ y_{t} = c_{1} x_{t} + c_{2} x_{t}^{2} + e_{t} \end{matrix}$ where $w_{t}$ and $e_{t}$ are independent zero-mean Gaussian noises with variances $Q$ and $R$, respectively. This model is taken from [11], and a similar model was considered in other papers on WH model identification [28].

In this model, five parameters need to be identified, which are
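To see why the process noise matters in this example, the model can be simulated and its output compared with the noise-free output $c_{1} x_{t}^{o} + c_{2} (x_{t}^{o})^{2}$. The following is a sketch under explicit assumptions, not the authors' experiment: the recursion is taken to act on the noise-free state (so that $x_{t}^{o}$ depends only on the input, as in $x_{t}^{o} = G (q) u_{t}$), initial conditions are zero, the input is uniform on $[-1, 1]$, and all numerical values and function names are made up. Under these assumptions $E [x_{t}^{2}] = (x_{t}^{o})^{2} + Q$, so the output is shifted on average by $c_{2} Q$.

```python
import math
import random


def simulate_example(u, a0, c1, c2, q_var, r_var, rng):
    """Simulate the first-order example; returns the noise-free state and the output."""
    x_o_list, y = [], []
    x_o = 0.0  # assumed zero initial condition
    for t in range(len(u)):
        u_prev = u[t - 1] if t > 0 else 0.0
        # Assumption: the recursion uses the previous noise-free state.
        x_o = a0 * x_o + u_prev
        x = x_o + rng.gauss(0.0, math.sqrt(q_var))          # process noise w_t
        y.append(c1 * x + c2 * x * x + rng.gauss(0.0, math.sqrt(r_var)))
        x_o_list.append(x_o)
    return x_o_list, y


def mean_output_shift(x_o, y, c1, c2):
    """Average difference between the noisy output and the noise-free output."""
    noise_free = [c1 * v + c2 * v * v for v in x_o]
    return sum(yi - ni for yi, ni in zip(y, noise_free)) / len(y)


rng = random.Random(0)
u = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
x_o, y = simulate_example(u, a0=0.5, c1=1.0, c2=0.5, q_var=0.25, r_var=0.01, rng=rng)
shift = mean_output_shift(x_o, y, c1=1.0, c2=0.5)
print("empirical shift:", shift, "theoretical c2*Q:", 0.5 * 0.25)
```

A criterion that treats all disturbances as additive output noise has no term for this shift, which gives some intuition for the biased estimates reported for PEM when process noise is present.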

Benchmark problem

In this section, we use the proposed method to identify a WH benchmark problem given in Schoukens et al. [29]. In line with the model structure shown in Fig. 1, $G (q)$ of this benchmark model is a Chebyshev filter, and $H (q)$ is an inverse Chebyshev filter. The static nonlinearity of this benchmark model is built with a diode circuit.

We use the following model to describe this benchmark process: $\begin{matrix} x_{t}^{o} = \frac{\sum_{i = 0}^{n_{b}} b_{i} q^{- i}}{1 + \sum_{i = 1}^{n_{a}} a_{i} q^{- i}} u_{t} \\ x_{t} = x_{t}^{o} + w_{t} \\ f (x_{t}) = c_{0} + c_{1} x_{t} + c_{2} x_{t}^{2} \\ y_{t} = \sum_{i = 0}^{M} \lambda_{i} f (x_{t - i}) + v_{t} \end{matrix}$ where $w_{t} \sim N (0,$

Conclusion

In this paper, a new method for WH model identification is proposed using the variational Bayesian approach. Its merit is that it provides unbiased parameter estimates in the presence of process and measurement noise by maximizing the likelihood function. To achieve this, two contributions are made. First, we propose a framework for identifying the model based on the EM algorithm. This avoids computing the likelihood function directly, which is

Declaration of Competing Interest

None.

Acknowledgments

This work is supported by the National Key Research and Development Project (2019YFB2006603), the National Natural Science Foundation of China (61903051, 61773080), and the Fundamental Research Funds for the Central Universities (2019CDXYZDH0014, 2019CDYGZD001).