1 Introduction

Target tracking plays an important role in a variety of practical applications, such as underwater sonar tracking [1], aircraft surveillance [2], and visual tracking [3, 4]. The objective is to accurately estimate the target state from a sequence of noisy observation sets. When the system is linear and the noise is Gaussian, the Kalman filter provides the optimal technique for estimating the target state [5], and its variants have been studied under numerous relaxed assumptions [6–9]. The limitation of these filters is that they assume complete prior knowledge of the dynamic and measurement model parameters, including the noise statistics. In many practical situations, model uncertainty is caused by unknown dynamic model parameters and noise uncertainty is caused by unknown noise statistics; both violate this assumption.

Model uncertainty arises because the target dynamics cannot be properly described by a single state space model (SSM) [10]. For example, in maneuvering target tracking, the target exhibits different maneuvering behaviors, such as constant velocity, acceleration, and turning with different angular velocities. The interacting multiple model (IMM) algorithm was therefore proposed in [11], in which multiple models are used to match the different maneuvering behaviors and the transitions among models follow a Markov process. Since the state estimation of each model runs in parallel and independently, the computational complexity grows as the number of models increases, and the conflict among the models becomes more significant. To deal with this problem, Li and Bar-Shalom [12] proposed variable structure multiple model (VSMM) methods, which adjust the model set in real time. The core idea of VSMM is model set adaptation (MSA), which aims at finding the best model set for state estimation. Several MSA implementations have been proposed, such as expected mode augmentation (EMA) [13], the likely model set (LMS) [14], the minimal sub-model set (MSS) [15], and best model augmentation (BMA) [16].

In addition, noise uncertainty also degrades the performance of the tracking system. In practice, the measurement noise usually varies with interference. To cope with noise uncertainty, adaptive filtering is used to address state estimation with unknown noise statistics [17]. Adaptive filtering algorithms fall into four categories: maximum likelihood, correlation, covariance matching, and Bayesian methods. Among them, the Bayesian method can be seen as a generalization of the other three. However, analytical solutions are difficult to obtain for most Bayesian algorithms because of the complexity of the probability density function and the high-dimensional integrals involved. These adaptive filters assume that the state sequence follows a first-order Markov process. For stationary random sequences (e.g., image and video signals [18]), a denoising technique based on the intersection of confidence intervals (ICI) rule was presented to provide a noise-free image or its best possible estimate [19, 20].

Recently, the variational Bayesian adaptive Kalman filter (VBAKF) proposed in [17] has been used to estimate the target state when the measurement noise variances are unknown. Its main idea is that the joint posterior of the target state and the measurement noise variances can be approximated by a factored free-form distribution (for models in the conjugate-exponential class). Unfortunately, the linear-system and Gaussian assumptions rarely hold in real applications, so the VBAKF cannot achieve the required filtering performance there. Nonlinear estimation methods, such as the unscented Kalman filter (UKF) and the cubature Kalman filter (CKF), have therefore been combined with the VB approximation [21, 22]: the measurement noise variances are approximated by the variational Bayesian (VB) approach, and the system state is then updated by the nonlinear estimator. Hu [23] proposed a robust version of the VBAKF that models the measurement noise with a Student's t distribution, and [23] was extended in [24] and [25].

However, the abovementioned algorithms consider only one of these uncertainties. In real environments, model and noise uncertainties have to be considered simultaneously. Several approaches to this problem can be found in the literature. In [26], a novel estimator was presented for jump Markov linear systems with unknown measurement noise variance parameters: a merging scheme is adopted for the system noises in the fusion stage of the IMM approach, and a fixed recursive form is used to estimate the noise variance parameters. Building on [26], Hong [27] presented a robust variational Bayesian interacting multiple model (IMM-VB) filter, which models glint noise with a Gaussian mixture distribution. Gao [28] proposed an interacting multiple model estimation-based adaptive robust UKF, which establishes an adaptive fading UKF for process model uncertainty and a robust UKF for measurement model uncertainty. These approaches perform well in the presence of model and noise uncertainties, but they carry a very high computational burden, because many models must be designed in the IMM algorithm to obtain the required results.

In this paper, we present a model set adaptive filtering algorithm based on variational Bayesian approximation to address the state estimation problem under both uncertainties. The Rényi information divergence is used as a criterion to compute the divergence between the true mode and the candidate models. A model-conditioned estimation based on variational Bayesian approximation is then developed to fuse the state and measurement noise variances. Two simulation experiments illustrate the effectiveness of the proposed algorithm.

The rest of the paper is organized as follows. Section 2 outlines the methods. In Section 3, the state estimation problem with unknown noise statistics is formulated and the variational Bayesian method is briefly reviewed. The proposed algorithm is described in Section 4. In Section 5, simulation results are presented to demonstrate the effectiveness of the proposed algorithm. Finally, conclusions are given in Section 6.

2 Methods/experimental

The main drawbacks of the interacting multiple model method for target tracking are its high computational complexity and limited performance. A model set adaptive filtering algorithm based on variational Bayesian approximation is proposed in this paper, built on the ideas of the VSMM and VB methods. The Rényi information divergence measures the “closeness” of two probability density functions and offers additional flexibility in that it allows emphasis to be placed on specific portions of the support of the densities being compared. Hence, the Rényi information divergence is used as a criterion to choose the best-matching model, i.e., the candidate model with the minimum divergence from the true mode. A moment matching technique is used to obtain the mixed statistics of the measurement noise and the system state at the fusion stage.

The paper performs Monte Carlo simulations in MATLAB to examine the behavior of the proposed method. Root mean square errors (RMSEs) are used to evaluate the estimation accuracy. We compare our method with IMM-VB and IMM in two different scenarios; the experimental parameters are introduced in Section 5.

This paper does not contain any studies with human participants or animals performed by any of the authors.

3 Variational Bayesian approximation

3.1 Problem formulation

Consider the following state space model:

$$ \left\{ \begin{aligned} x_{k} &=F_{k}^{r_{k}}x_{k-1}+w_{k-1}^{r_{k}} \\ z_{k} &=H_{k}^{r_{k}}x_{k}+v_{k} \end{aligned} \right. $$
(1)

where \(x_{k}\in \mathbb {R}^{n}\) and \(z_{k}\in \mathbb {R}^{d}\) are the target state and the measurement vectors, respectively. rk denotes the system mode, which is described by a discrete-time homogeneous Markov chain. The process noise \(w_{k-1}^{r_{k}}\) corresponding to mode rk and the measurement noise vk are assumed to be mutually independent zero-mean Gaussian random processes with covariance matrices \(Q_{k-1}^{r_{k}}\) and Σk, respectively. Here, we denote the diagonal covariance matrix comprising these variances by \(\Sigma _{k}=\text {diag}\{\sigma _{k,1}^{2},\sigma _{k,2}^{2},\ldots,\sigma _{k,d}^{2}\}\). Since the inverse Wishart distribution is the conjugate prior for the variance of a Gaussian distribution [29], a product of inverse Wishart models is adopted to approximate the posterior distribution of Σk. That is,

$$ p(\Sigma_{k})=IW(\Sigma_{k};{\kappa}_{k},{\Lambda}_{k}) $$
(2)

where the notation IW(Σk;κk,Λk) represents the inverse Wishart distribution for the variable Σk with the degree of freedom κk, and the symmetric positive definite matrix Λk.

Remark 1

Since the inverse gamma distribution is a special case of the inverse Wishart distribution in one-dimensional space, the model discussed in this paper is more general, and thus more information can be utilized for filter design. Interested readers are referred to [30] and [31] for a detailed introduction.

However, Σk is unknown in most cases, which requires jointly estimating the posterior distribution of the target state and the measurement noise covariance. Assume that the dynamic models of the state and the covariance matrix are independent for any mode rk, that is,

$$ p(x_{k},\Sigma_{k}|x_{k-1},\Sigma_{k-1},r_{k})=p(x_{k}|x_{k-1},r_{k})p(\Sigma_{k}|\Sigma_{k-1},r_{k}) $$
(3)

The predicted joint distribution of the state and measurement noise is calculated by the Chapman–Kolmogorov equation.

$$ {\begin{aligned} p(x_{k},\Sigma_{k}|r_{k},Z_{k-1})&= \int p(x_{k}|x_{k-1},r_{k})p(\Sigma_{k}|\Sigma_{k-1},r_{k})\\ &\quad \times p(x_{k-1},\Sigma_{k-1}|r_{k},Z_{k-1})\,dx_{k-1}\,d\Sigma_{k-1} \end{aligned}} $$
(4)

When the measurement zk is available, the joint posterior distribution is given by Bayes' rule.

$$ {\begin{aligned} p(x_{k},\Sigma_{k}|r_{k},Z_{k})=\frac{p(Z_{k}|x_{k},\Sigma_{k},r_{k},Z_{k-1})p(x_{k},\Sigma_{k}|r_{k},Z_{k-1})}{\int p(Z_{k}|x_{k},\Sigma_{k},r_{k},Z_{k-1})p(x_{k},\Sigma_{k}|r_{k},Z_{k-1})\,dx_{k}\,d\Sigma_{k}} \end{aligned}} $$
(5)

where p(Zk|xk,Σk,rk,Zk−1) denotes the likelihood function, which depends on Σk.

Notice that two main problems need to be solved. One is that the dynamic model of the measurement noise covariance p(Σk|Σk−1,rk) is unknown. The other is that the posterior density is difficult to obtain because of the intractable integrals involved. To calculate the posterior density with unknown noise covariance, the variational Bayesian approximation method, which uses a simple free-form distribution to approximate the joint posterior density, is adopted.

3.2 Variational Bayesian approximation

Assume that the state vector and the measurement noise covariance are independent; then the joint posterior density can be approximated by a free-form factored distribution as follows

$$ p(x_{k},\Sigma_{k}|r_{k},Z_{k})\approx Q_{x}(x_{k})Q_{\Sigma}(\Sigma_{k}) $$
(6)

where the probability densities Qx(xk) and QΣ(Σk) are a Gaussian distribution and an inverse Wishart distribution, respectively. The non-negative KL divergence measures the dissimilarity between the approximation and the true posterior and can be expressed as

$$ \begin{aligned} &KL\left[Q_{x}(x_{k})Q_{\Sigma}(\Sigma_{k})\|p(x_{k},\Sigma_{k}|r_{k},Z_{k})\right]\\ &=\int Q_{x}(x_{k})Q_{\Sigma}(\Sigma_{k})\log \frac{{Q_{x}(x_{k})Q_{\Sigma}(\Sigma_{k})}}{{p(x_{k},\Sigma_{k}|r_{k},Z_{k})}}\,dx_{k}\,d\Sigma_{k} \end{aligned} $$
(7)

The optimal approximation of the joint posterior density can be obtained by minimizing the KL divergence, and the mean field approximation is used to solve the calculation problem of multiple hidden variables [29]. The results are given as

$$ Q_{x}(x_{k})=N(x_{k};{\hat{x}_{k}},P_{k}) $$
(8)
$$ Q_{\Sigma}(\Sigma_{k})=\prod\limits_{u = 1}^{d} {IW\left(\sigma_{k,u}^{2};{\kappa}_{k,u},{\Lambda}_{k,u}\right)} $$
(9)

4 Model set adaptive filtering algorithm based on variational Bayesian approximation

In this section, a model set adaptive filtering algorithm is proposed. The model set adaptive approach is used to select the best model set for multiple model estimation. Moreover, the noise statistics and state of each model are estimated by the model set conditioned estimation based on variational Bayesian approximation.

4.1 Model-set adaptation

To address the problem of model uncertainty, IMM-VB needs many models to achieve good performance. This has two obvious defects [13]. First, the computational load grows with the number of models. Second, the competition among the models leads to a marked decrease in performance.

To overcome this problem, an MSA algorithm based on the Rényi divergence is proposed. The Rényi information divergence is a distance measure between two densities (the test density f and the reference density f0) [32]. The order-α Rényi information divergence of f and f0 is defined as

$$ D_{\alpha}(f\|f_{0})=-\frac{1}{1-\alpha}\ln \int f^{\alpha}(x)f_{0}^{1-\alpha}(x)\,dx $$
(10)

For any order α, the Rényi information divergence attains its minimum value if and only if f=f0. In our application, we wish to compute the divergence between the true mode sk and each candidate model \(r_{k}^{j} \in \mathcal {M}^{c}\) at time k. That is,

$$ \begin{aligned} {D_{z}}\left({s_{k}},r_{k}^{j}\right)& \overset{\Delta}{=} D\left(p\left({z_{k}}|{s_{k}}\right)\|p\left({z_{k}}|r_{k}^{j}\right)\right) \\ &= -\frac{1}{1-\alpha}\ln \int p^{\alpha}({z_{k}}|{s_{k}})p^{1-\alpha}\left({z_{k}}|r_{k}^{j}\right)\,d{z_{k}} \end{aligned} $$
(11)

where p(zk|sk) and \(p\left ({z_{k}}|r_{k}^{j}\right)\) are the probability density functions of zk conditioned on sk and \(r_{k}^{j}\), respectively. Since the true mode sk of the system is unknown at time k [16], it is assumed that the best online estimate of the probability density function of zk for the true mode sk can be approximated as \(p({z_{k}}|{\mathcal {M}_{k - 1}},{s_{k}},{\Sigma _{{s_{k}}}},{Z_{k - 1}})\approx p({z_{k}}|{\mathcal {M}_{k - 1}},{\mathcal {M}_{k}},{\Sigma _{{k}}},{Z_{k - 1}})\).

Let \(p({z_{k}}|{s_{k}})=\mathcal {N}\left ({z_{k}};{\bar z_{{s_{k}}}},{\Phi _{{s_{k}}}}\right)\) and \(p\left ({z_{k}}|r_{k}^{j}\right)=\mathcal {N}\left ({z_{k}};\bar z_{k}^{j},\Phi _{k}^{j}\right)\) be the Gaussian densities with vector means \(\bar z_{s_{k}}, \bar z_{k}^{j}\) and positive definite covariance matrices \(\Phi _{s_{k}}, \Phi _{k}^{j}\). The Rényi information divergence between p(zk|sk) and \(p\left ({z_{k}}|r_{k}^{j}\right)\) is

$$ {\begin{aligned} {D_{z}}\left({s_{k}},r_{k}^{j}\right)&= -\frac{1/2}{1-\alpha}\ln \frac{\left|\Phi_{k}^{j}\right|^{\alpha}\left|\Phi_{s_{k}}\right|^{1-\alpha}}{\left|\alpha\Phi_{k}^{j}+(1-\alpha)\Phi_{s_{k}}\right|}\\ &\quad+\frac{\alpha}{2}\triangle^{T}\left(\alpha\Phi_{k}^{j}+(1-\alpha)\Phi_{s_{k}}\right)^{-1}\triangle \end{aligned}} $$
(12)

where \(\triangle =\bar z_{s_{k}}-\bar z_{k}^{j}\). These means and covariance matrices of the Gaussian densities can be calculated as

$$ \left\{ \begin{aligned} &\ {\bar z}_{{s_{k}}} = E\left[{z_{k}}|{\mathcal{M}_{k - 1}},{\mathcal{M}_{k}},{Z_{k - 1}}\right] \\ &{\Phi_{{s_{k}}}} = E\left[\left({z_{k}} - {\bar z}_{{s_{k}}}\right){{\left({z_{k}} - {{\bar z}_{{s_{k}}}}\right)}^{T}}|{\mathcal{M}_{k - 1}},{\mathcal{M}_{k}},Z_{k - 1}\right] \\ & \bar z_{k}^{j} = E\left[{z_{k}}|r_{k}^{j},x_{k|k - 1}^{j},P_{k - 1}^{j}\right] \\ & \Phi_{k}^{j} = E\left[\left({z_{k}} - \bar z_{k}^{j}\right){{\left({z_{k}} - \bar z_{k}^{j}\right)}^{T}}|r_{k}^{j},x_{k|k - 1}^{j},P_{k - 1}^{j}\right] \end{aligned} \right. $$
(13)

Remark 2

Different selections of the parameter α allow different parts of the distributions to be emphasized. In the limiting case α→1, the Rényi information divergence becomes the Kullback–Leibler divergence. The effect of the order on the Rényi information divergence is analyzed in detail in [33–35]. The results show that α=0.5 emphasizes the tails of the distributions and allows for maximum discrimination between two similar distributions. Therefore, we obtain better performance by choosing α=0.5 for tracking applications [36, 37].

The optimal model \(\hat {r}_{k}\) in the candidate model set \(\mathcal {M}^{c}\) can be selected as the one with the minimum Rényi information divergence.

$$ {\hat{r}_{k}} = \arg \underset{r_{k}^{j} \in \mathcal{M}^{c}}{\min} {D_{z}}\left({s_{k}},r_{k}^{j}\right) $$
(14)

Thus, the adapted model set at time k is the basic model set \(\mathcal {M}^{b}\) combined with the model \(\hat {r}_{k}\).

$$ {\mathcal{M}_{k}} = {\mathcal{M}^{b}} \bigcup {\hat{r}_{k}} $$
(15)
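As an illustration, the closed-form divergence of Eq. (12) and the selection rule of Eq. (14) can be sketched in Python. This is a minimal sketch; the function names and the choice α=0.5 follow Remark 2, while the densities passed in the usage example are illustrative, not values from the experiments.

```python
import numpy as np

def renyi_divergence_gauss(mean_s, cov_s, mean_j, cov_j, alpha=0.5):
    """Order-alpha Renyi divergence of Eq. (12) between the true-mode
    density N(mean_s, cov_s) and a candidate density N(mean_j, cov_j)."""
    delta = mean_s - mean_j
    mix = alpha * cov_j + (1 - alpha) * cov_s
    log_term = (alpha * np.log(np.linalg.det(cov_j))
                + (1 - alpha) * np.log(np.linalg.det(cov_s))
                - np.log(np.linalg.det(mix)))
    return (-0.5 / (1 - alpha)) * log_term \
        + 0.5 * alpha * delta @ np.linalg.solve(mix, delta)

def select_model(mean_s, cov_s, candidates, alpha=0.5):
    """Eq. (14): index of the candidate with minimum divergence."""
    divs = [renyi_divergence_gauss(mean_s, cov_s, m, c, alpha)
            for m, c in candidates]
    return int(np.argmin(divs))
```

For identical densities the divergence is zero, so a candidate matching the true-mode moments is always selected.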

4.2 Model-set conditioned estimation based on variational Bayesian approximation

Suppose that the posterior probability density function of model i at time k−1 is described as

$$ {\begin{aligned} p\left({x_{k-1}},{\Sigma_{k - 1}}|r_{k - 1}^{i}\right) &= \mathcal{N}\left({x_{k - 1}};\hat{x}_{k - 1|k - 1}^{i},P_{k - 1|k - 1}^{i}\right) \\ &\quad\times \prod\limits_{u = 1}^{d} {IW\left(\sigma_{k - 1,u}^{2};{\kappa}_{k - 1,u}^{i},{\Lambda}_{k - 1,u}^{i}\right)} \end{aligned}} $$
(16)

Note that Eq. (16) is a product of a Gaussian distribution and an inverse Wishart distribution. \(r_{k - 1}^{i}\) denotes the event that model i matches the system mode in effect at time k−1, \(r_{k - 1}^{i} \in {\mathcal {M}_{k - 1}}\). The notation \( \mathcal {N}\left ({x_{k - 1}};\hat {x}_{k - 1|k - 1}^{i},P_{k - 1|k - 1}^{i}\right)\) represents the Gaussian probability density function with mean \(\hat {x}_{k - 1|k - 1}^{i}\) and covariance \(P_{k - 1|k - 1}^{i}\). The notation \(IW\left (\sigma _{k - 1,u}^{2};{\kappa }_{k - 1,u}^{i},{\Lambda }_{k - 1,u}^{i}\right)\) represents an inverse Wishart distribution with parameters \({\kappa }_{k - 1,u}^{i}\) and \({\Lambda }_{k - 1,u}^{i}\). By the total probability theorem, one has

$$ p({x_{k}},{\Sigma_{k}}|{Z_{k}}) = \sum\limits_{j = 1}^{{\mathcal{M}_{k-1}}} {p\left({x_{k}},{\Sigma_{k}}|r_{k}^{j},{Z_{k}}\right)p\left(r_{k}^{j}|{Z_{k}}\right)}\\ $$
(17)

In the model set conditional reinitialization stage, the joint probability density function is described as

$$ \begin{aligned} &p\left({x_{k - 1}},{\Sigma_{k - 1}}|r_{k}^{j},{Z_{k - 1}}\right) \\ &= \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}} {p\left({x_{k - 1}},{\Sigma_{k - 1}}|r_{k - 1}^{i},{Z_{k - 1}}\right)p\left(r_{k - 1}^{i}|r_{k}^{j},{\mathcal{M}_{k - 1}}\right)} \\ &= \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}} {\mu_{k - 1}^{i|j}\mathcal{N}\left({x_{k - 1}};\hat{x}_{k - 1|k - 1}^{i},P_{k - 1|k - 1}^{i}\right)}\\ &\quad{\prod\limits_{u = 1}^{d} {IW\left(\sigma_{k - 1,u}^{2};{\kappa}_{k - 1,u}^{i},{\Lambda}_{k - 1,u}^{i}\right)} } \end{aligned} $$
(18)

where \(\mu _{k - 1}^{i|j}\) denotes the conditional probability of model i at time k−1 given model j at time k. It is calculated by

$$ \mu_{k - 1}^{i|j} = \frac{{p\left(r_{k}^{j}|r_{k - 1}^{i}\right)p\left(r_{k - 1}^{i}|{Z_{k - 1}}\right)}}{{p\left(r_{k}^{j}|{Z_{k - 1}}\right)}} = \frac{{{\pi_{ij}}\mu_{k - 1}^{i}}}{{\hat{\mu}_{k|k - 1}^{j}}} $$
(19)

where \(\hat {\mu }_{k|k- 1}^{j}\) is the normalization coefficient. Based on the model set \(\mathcal {M}_{k}\), the output of each filter is merged in the fusion stage [13]. We therefore approximate the sum in (18) by a single term, that is,

$$ {\begin{aligned} p\left({x_{k - 1}},{\Sigma_{k - 1}}|r_{k}^{j}\right) &= \mathcal{N}\left({x_{k - 1}};\hat{x}_{k - 1|k - 1}^{0j},P_{k - 1|k - 1}^{0j}\right)\\ &\quad \times \prod\limits_{u = 1}^{d} {IW\left(\sigma_{k - 1,u}^{0j};{\kappa}_{k - 1,u}^{0j},{\Lambda}_{k - 1,u}^{0j}\right)} \end{aligned}} $$
(20)

where \(\hat {x}_{k - 1|k - 1}^{0j}\) and \(P_{k - 1|k - 1}^{0j}\) are mixed state and covariance matrix of the model j, respectively.

$$ \hat{x}_{k - 1|k - 1}^{0j} = \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}} {\mu_{k - 1}^{i|j}\hat{x}_{k - 1|k - 1}^{i}} \\ $$
(21)
$$ {\begin{aligned} P_{k - 1|k - 1}^{0j}&= \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}}{{\mu_{k - 1}^{i|j}}P_{k - 1|k - 1}^{i}}\\ &\quad + \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}} {{\mu_{k - 1}^{i|j}}\left(\hat{x}_{k - 1|k - 1}^{i} - \hat{x}_{k - 1|k - 1}^{0j}\right){{(\cdot)}^{T}}} \end{aligned}} $$
(22)
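The mixing step of Eqs. (21) and (22) can be written compactly as follows. This is a sketch with hypothetical function and argument names; the two-model inputs in the usage below are illustrative.

```python
import numpy as np

def mix_estimates(mu, x_hats, P_s):
    """Moment-matched mixing of per-model estimates (Eqs. (21)-(22)).

    mu:     (M,) mixing weights mu_{k-1}^{i|j}
    x_hats: (M, n) per-model state estimates
    P_s:    (M, n, n) per-model covariances
    """
    x0 = np.einsum('i,in->n', mu, x_hats)            # Eq. (21): weighted mean
    P0 = np.zeros_like(P_s[0])
    for w, x, P in zip(mu, x_hats, P_s):
        d = (x - x0)[:, None]
        P0 += w * (P + d @ d.T)                      # Eq. (22): spread-of-means term
    return x0, P0
```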

On the basis of moment matching theory [26], the first and second moments of the variable \(\sigma _{k - 1,u}^{0j}\) in (20) are

$$ E\left[\sigma_{k - 1,u}^{0j}\right] = \frac{{{\Lambda}_{k - 1,u}^{0j}}}{{{\kappa}_{k - 1,u}^{0j} - d - 1}} $$
(23)
$$ Var\left[\sigma_{k - 1,u}^{0j}\right] = \frac{{2{{\left({\Lambda}_{k - 1,u}^{0j}\right)}^{2}}}}{{{{\left({\kappa}_{k - 1,u}^{0j} - d - 1\right)}^{2}}\left({\kappa}_{k - 1,u}^{0j} - d - 3\right)}} $$
(24)

The mean and variance of the inverse Wishart sum distribution in (18) are given by

$$ E\left[\sigma_{k - 1,u}^{0j}\right] = \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}} {\mu_{k - 1}^{i|j}\frac{{{\Lambda}_{k - 1,u}^{i}}}{{{\kappa}_{k - 1,u}^{i} - d - 1}}} $$
(25)
$$ {\begin{aligned} Var\left[\sigma_{k - 1,u}^{0j}\right] &= \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}} {\mu_{k - 1}^{i|j}} {\frac{{2{{\left({\Lambda}_{k - 1,u}^{i}\right)}^{2}}}}{{{{\left({\kappa}_{k - 1,u}^{i} - d - 1\right)}^{2}}\left({\kappa}_{k - 1,u}^{i} - d - 3\right)}}}\\ &\quad + \sum\limits_{i = 1}^{{\mathcal{M}_{k - 1}}} {\mu_{k - 1}^{i|j}}{\left(\frac{{{\Lambda}_{k - 1,u}^{i}}}{{{\kappa}_{k - 1,u}^{i} - d - 1}} - E\left[\sigma_{k - 1,u}^{0j}\right]\right)^{2}} \end{aligned}} $$
(26)

By solving Eqs. (23) and (24), the parameters \({\kappa }_{k - 1,u}^{0j}\) and \({\Lambda }_{k - 1,u}^{0j}\) are calculated as follows

$$ {\kappa}_{k - 1,u}^{0j} = \frac{{2{{\left(E\left[\sigma_{k - 1,u}^{0j}\right]\right)}^{2}}}}{{Var\left[\sigma_{k - 1,u}^{0j}\right]}} + d + 3 $$
(27)
$$ {\Lambda}_{k - 1,u}^{0j} = \left({\frac{{2{{\left(E\left[\sigma_{k - 1,u}^{0j}\right]\right)}^{2}}}}{{Var\left[\sigma_{k - 1,u}^{0j}\right]}} + 2} \right)E\left[\sigma_{k - 1,u}^{0j}\right] $$
(28)
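The moment matching of Eqs. (23)–(28) can be checked numerically. The sketch below uses hypothetical helper names, and the parameter values in the usage are illustrative; note that substituting Eqs. (23)–(24) into Eqs. (27)–(28) recovers the original parameters exactly for a one-component "mixture".

```python
import numpy as np

def iw_moments(kappa, lam, d):
    """Mean and variance of one inverse Wishart component, Eqs. (23)-(24)."""
    e = lam / (kappa - d - 1)
    v = 2 * lam**2 / ((kappa - d - 1)**2 * (kappa - d - 3))
    return e, v

def match_iw_mixture(mu, kappas, lams, d):
    """Collapse an IW mixture to one IW by moment matching, Eqs. (25)-(28)."""
    moments = [iw_moments(k, l, d) for k, l in zip(kappas, lams)]
    e = sum(w * ei for w, (ei, _) in zip(mu, moments))                 # Eq. (25)
    v = sum(w * (vi + (ei - e) ** 2) for w, (ei, vi) in zip(mu, moments))  # Eq. (26)
    kappa0 = 2 * e**2 / v + d + 3                                      # Eq. (27)
    lam0 = (2 * e**2 / v + 2) * e                                      # Eq. (28)
    return kappa0, lam0
```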

Remark 3

In the proposed algorithm, the key feature is that the Gaussian sum distribution for the estimated state is approximated by a single Gaussian distribution. Similarly, the inverse Wishart sum distribution for the measurement noise covariance matrix Σk is approximated by a single inverse Wishart distribution by matching the first and second moments.

Then, the mixed target state and measurement noise covariance are taken as the filter input. The predicted density is computed using the factorization of Eq. (3).

$$ {\begin{aligned} &p\left({x_{k}},{\Sigma_{k}}|r_{k}^{j},{Z_{k - 1}}\right)\\ &= \int {p\left({x_{k}}|{x_{k - 1}},r_{k}^{j}\right)p\left({\Sigma_{k}}|{\Sigma_{k - 1}},r_{k}^{j}\right)}p\left({x_{k - 1}},{\Sigma_{k - 1}}|r_{k}^{j},{Z_{k - 1}}\right)d{x_{k - 1}}d{\Sigma_{k - 1}} \\ &= \mathcal{N}\left({x_{k}};\hat{x}_{k|k - 1}^{j},P_{k|k - 1}^{j}\right)\prod\limits_{u = 1}^{d} {IW\left(\sigma_{k - 1,u}^{2};{\kappa}_{k - 1,u}^{j - },{\Lambda}_{k - 1,u}^{j - }\right)} \end{aligned}} $$
(29)

where, to account for the time variation of the parameters, a forgetting factor ρ∈(0,1] is introduced. Note that the closer ρ is to 0, the more the parameters are assumed to fluctuate over time [38]. The system state, covariance matrix, and parameters of the inverse Wishart distribution are predicted by

$$\begin{array}{*{20}l} \hat{x}_{k|k - 1}^{j} &= F_{k - 1}^{j}x_{k - 1|k - 1}^{0j} \end{array} $$
(30)
$$\begin{array}{*{20}l} P_{k|k - 1}^{j} &= F_{k - 1}^{j}P_{k - 1|k - 1}^{0j}(F_{k - 1}^{j})^{T} + Q_{k - 1}^{j} \end{array} $$
(31)
$$\begin{array}{*{20}l} {\kappa}_{k - 1,u}^{j -} &= {\rho}{\kappa}_{k - 1,u}^{0j} \end{array} $$
(32)
$$\begin{array}{*{20}l} {\Lambda}_{k - 1,u}^{j -} &= {\rho}{\Lambda}_{k - 1,u}^{0j} \end{array} $$
(33)
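The time update of Eqs. (30)–(33) amounts to a standard Kalman prediction of the mixed state plus an exponential discount of the inverse Wishart parameters. A minimal sketch (the function name is hypothetical and ρ=0.95 is an illustrative default, not a value prescribed by the paper):

```python
import numpy as np

def predict(F, Q, x0, P0, kappa0, lam0, rho=0.95):
    """Time update of Eqs. (30)-(33) for one model."""
    x_pred = F @ x0                                   # Eq. (30)
    P_pred = F @ P0 @ F.T + Q                         # Eq. (31)
    return x_pred, P_pred, rho * kappa0, rho * lam0   # Eqs. (32)-(33)
```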

After receiving the measurement zk, the VB approximation method can be used to obtain a free-form factored approximation of \(p\left ({x_{k}},{\Sigma _{k}}|r_{k}^{j},{Z_{k}}\right)\) [38]. The analytical expression of the joint posterior probability density function is given by

$$ {\begin{aligned} &p\left({x_{k}},{\Sigma_{k}}|r_{k}^{j},{Z_{k}}\right) \propto p\left({z_{k}}|{x_{k}},{\Sigma_{k}},r_{k}^{j}\right)p\left({x_{k}},{\Sigma_{k}}|r_{k}^{j},{Z_{k - 1}}\right) \\ &= \mathcal{N}\left({z_{k}};H_{k}^{j}x_{k},{\Sigma_{k}}\right)\mathcal{N}\left(x_{k}^{j};\hat{x}_{k|k - 1}^{j},P_{k|k - 1}^{j}\right)\\ &\quad \prod\limits_{u = 1}^{d}{IW\left(\sigma_{k - 1,u}^{2};{\kappa}_{k - 1,u}^{j - },{\Lambda}_{k - 1,u}^{j - }\right)} \end{aligned}} $$
(34)

To calculate the distributions of the state and measurement noise covariance of the jth model, VB assumes that the joint posterior distribution in (34) can be factorized as the product of \(q\left (x_{k}^{j}\right)\) and \(q\left (\Sigma _{k}^{j}\right)\). The logarithm of \(q\left (x_{k}^{j}\right)\) is computed by fixing \(q\left (\Sigma _{k}^{j}\right)\).

$$ {\begin{aligned} \log \left(q\left(x_{k}^{j}\right)\right) &\propto - \frac{1}{2}{\left({z_{k}} - H_{k}^{j}\hat{x}_{k|k - 1}^{j}\right)^{T}}{\left(\Sigma_{k}^{j}\right)^{- 1}}\left({z_{k}} - H_{k}^{j}\hat{x}_{k|k - 1}^{j}\right) \\ &- \frac{1}{2}{\left(x_{k}^{j} - \hat{x}_{k|k - 1}^{j}\right)^{T}}{\left(P_{k|k - 1}^{j}\right)^{- 1}}\left(x_{k}^{j} - \hat{x}_{k|k - 1}^{j}\right) \end{aligned}} $$
(35)

After simplification, \(q\left (x_{k}^{j}\right)\) is given by

$$ q\left(x_{k}^{j}\right)=\mathcal{N}\left(x_{k}^{j};\hat{x}_{k|k}^{j},P_{k}^{j}\right) $$
(36)

The mean and covariance of the Gaussian distribution are derived by the VB approximation as follows

$$ {\begin{aligned} \hat{x}_{k|k}^{j} &= \hat{x}_{k|k - 1}^{j}\\ &\quad + P_{k|k - 1}^{j}{\left(H_{k}^{j}\right)^{T}}{\left(\Sigma_{k}^{j} + H_{k}^{j}P_{k|k - 1}^{j}{\left(H_{k}^{j}\right)^{T}}\right)^{- 1}}\left({z_{k}} - H_{k}^{j} \hat{x}_{k|k - 1}^{j}\right) \end{aligned}} $$
(37)
$$ P_{k}^{j} = {\left({\left(P_{k|k - 1}^{j}\right)^{- 1}} + {\left(H_{k}^{j}\right)^{T}}{\left(\Sigma_{k}^{j}\right)^{- 1}}H_{k}^{j}\right)^{- 1}} $$
(38)

Similarly, the logarithm of \(q\left (\Sigma _{k}^{j}\right)\) is calculated by keeping \(q\left (x_{k}^{j}\right)\) fixed:

$$ {\begin{aligned} \log \left(q\left(\Sigma_{k}^{j}\right)\right) &\propto - \frac{{\kappa + d + 2}}{2}\log \left|\Sigma_{k}^{j}\right| \\ &\quad- \frac{1}{2}tr\left(\left({\Lambda}_{k - 1}^{j -} + \left({z_{k}} - H_{k}^{j} \hat{x}_{k|k}^{j}\right){{\left({z_{k}} - H_{k}^{j} \hat{x}_{k|k}^{j}\right)}^{T}} + H_{k}^{j}P_{k}^{j}{\left(H_{k}^{j}\right)^{T}}\right){\left(\Sigma_{k}^{j}\right)^{- 1}}\right) \end{aligned}} $$
(39)

Here, \(q\left (\Sigma _{k}^{j}\right)\) is approximated as

$$ q\left(\Sigma_{k}^{j}\right)=\prod\limits_{u = 1}^{d}{IW\left(\sigma_{k,u}^{2};{\kappa}_{k,u}^{j},{\Lambda}_{k,u}^{j}\right)} $$
(40)

where

$$ \begin{aligned} {\kappa}_{k,u}^{j} = {\kappa}_{k - 1,u}^{j -} + \frac{d}{2} \end{aligned} $$
(41)
$$ \begin{aligned} {\Lambda}_{k,u}^{j} &= {\Lambda}_{k - 1,u}^{j -} + {\left({z_{k}} - H_{k}^{j} \hat{x}_{k|k}^{j}\right)_{u}^{2}} + \left(H_{k}^{j}P_{k}^{j}{\left(H_{k}^{j}\right)^{T}}\right)_{uu} \end{aligned} $$
(42)

where u=1,2,…,d, and

$$ \hat{\Sigma}_{k}^{j}=\text{diag}\left(\frac{{\Lambda}_{k,1}^{j}}{{\kappa}_{k,1}^{j}-d-1},\ldots,\frac{{\Lambda}_{k,d}^{j}}{{\kappa}_{k,d}^{j}-d-1}\right) $$

Note that Eqs. (37), (38), (41), and (42) are coupled. A fixed-point algorithm is used to update the state and noise parameters alternately until convergence is reached. It has been shown in [17] that the VB iteration converges very quickly, usually within a few iterations.
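A minimal sketch of this fixed-point iteration for a single model, assuming a diagonal Σk whose components follow the stated inverse Wishart updates (the function name and the numbers in the usage below are illustrative, not taken from the experiments):

```python
import numpy as np

def vb_measurement_update(z, x_pred, P_pred, H, kappa_minus, lam_minus,
                          n_iter=10):
    """Fixed-point VB update alternating the state moments (Eqs. (37)-(38))
    with the inverse Wishart parameters (Eqs. (41)-(42))."""
    d = z.size
    kappa = kappa_minus + 0.5 * d                      # Eq. (41)
    lam = np.array(lam_minus, dtype=float)
    x, P = x_pred, P_pred
    for _ in range(n_iter):
        sigma = np.diag(lam / (kappa - d - 1))         # current E[Sigma_k]
        S = H @ P_pred @ H.T + sigma
        K = P_pred @ H.T @ np.linalg.inv(S)            # Kalman gain, Eq. (37)
        x = x_pred + K @ (z - H @ x_pred)
        P = P_pred - K @ H @ P_pred                    # equivalent to Eq. (38)
        resid = z - H @ x
        lam = lam_minus + resid**2 + np.diag(H @ P @ H.T)  # Eq. (42)
    return x, P, kappa, lam
```

The covariance update uses the gain form, which is algebraically equivalent to the information form of Eq. (38).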

Using Bayes' rule, the probability of each model is updated as

$$ {\begin{aligned} \mu_{k}^{j} &= p\left(r_{k}^{j}|{\mathcal{M}_{k}},{Z_{k}}\right) =\frac{p\left(z_{k}|r_{k}^{j},{Z_{k-1}}\right)p\left(r_{k}^{j}|{Z_{k-1}}\right)}{p(z_{k}|Z_{k-1})}\\ &= \frac{{\hat{\mu}_{k|k - 1}^{j}L_{k}^{j}}}{{\sum_{i=1}^{\mathcal{M}_{k}}}{\hat{\mu}_{k|k - 1}^{i}L_{k}^{i}} } \end{aligned}} $$
(43)

Here, the likelihood function \(L_{k}^{j}\) of model j is calculated as

$$ {\begin{aligned} L_{k}^{j} &= p\left({z_{k}}|r_{k}^{j},{\mathcal{M}_{k - 1}},{Z_{k - 1}}\right)\\ &= \mathcal{N}\left({z_{k}};H_{k}^{j} \hat{x}_{k|k - 1}^{j},H_{k}^{j}P_{k|k - 1}^{j}{\left(H_{k}^{j}\right)^{T}} + \hat{\Sigma}_{k}^{j}\right) \end{aligned}} $$
(44)

Finally, based on the mixing equations, the fused state and covariance in the fusion stage are

$$ {\hat{x}_{k|k}} = E\left[{x_{k}}|{\mathcal{M}_{k}},{z_{k}}\right] = \sum\limits_{j = 1}^{{\mathcal{M}_{k}}} {\mu_{k}^{j}\hat{x}_{k}^{j}} \\ $$
(45)
$$ \begin{aligned} {P_{k|k}} &= E\left[\left({x_{k}} - {\hat{x}_{k|k}}\right){\left({x_{k}} - {\hat{x}_{k|k}}\right)^{T}}|{\mathcal{M}_{k}},{z_{k}}\right] \\ &= \sum\limits_{j = 1}^{{\mathcal{M}_{k}}} {\mu_{k}^{j}\left(P_{k}^{j} + \left({{\hat{x}}_{k|k}} - \hat{x}_{k}^{j}\right){{\left({{\hat{x}}_{k|k}} - \hat{x}_{k}^{j}\right)}^{T}}\right)} \end{aligned} $$
(46)
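The probability update of Eq. (43) followed by the fusion of Eqs. (45)–(46) can be sketched as follows. The Gaussian density implements the likelihood of Eq. (44) for given moments; function names and the inputs in the usage are illustrative.

```python
import numpy as np

def gauss_pdf(z, mean, cov):
    """Gaussian density used for the model likelihood L_k^j (Eq. (44))."""
    d = z - mean
    norm = np.sqrt((2 * np.pi) ** z.size * np.linalg.det(cov))
    return np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm

def fuse(mu_pred, likelihoods, x_s, P_s):
    """Eq. (43) model-probability update, then Eqs. (45)-(46) fusion."""
    w = np.asarray(mu_pred) * np.asarray(likelihoods)
    mu = w / w.sum()                                        # Eq. (43)
    x = sum(m * xj for m, xj in zip(mu, x_s))               # Eq. (45)
    P = sum(m * (Pj + np.outer(x - xj, x - xj))             # Eq. (46)
            for m, xj, Pj in zip(mu, x_s, P_s))
    return mu, x, P
```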

The computational complexity of the model set adaptive filter is O(ln3), where n is the dimension of the state and l is the number of models in the adaptive filter. The estimation of the measurement noise in the VB step involves the following parameters: O(ld+l) for the posteriors of the means and O(ld2+l) for the posteriors of the precisions, where d is the dimension of the measurement. The total computational complexity of the proposed algorithm is therefore O(Nt[ln3+l+ld+l+ld2])=O(Nt[ln3+ld2]), where Nt is the total number of VB iterations. The proposed algorithm is summarized in the following:

5 Simulation results

In this section, numerical simulations are carried out to compare the performance of five algorithms: IMM3 (3 basic models), IMM11 (3 basic models and 8 CT models), IMM3-VB (3 basic models), IMM11-VB (3 basic models and 8 CT models), and MSA-VB (3 basic models with 8 CT candidate models) under unknown measurement noise and an unknown system model. The reference system dynamics are described by the following state space model

$$ {x_{k+1}} = \left[\begin{array}{cccc} 1&{\frac{{\sin (\omega t)}}{\omega }}&0&{\frac{{\cos(\omega t) - 1}}{\omega}}\\ 0&{\cos (\omega t)}&0&{ - \sin (\omega t)}\\ 0&{\frac{{1 - \cos (\omega t)}}{\omega }}&1&{\frac{{\sin(\omega t)}}{\omega}}\\ 0&{\sin (\omega t)}&0&{\cos (\omega t)} \end{array}\right] {x_{k}} + {w_{k}} $$
(47)

where \({x_{k}} = {\left [ {{\phi _{k}}} \quad {{{\dot \phi }_{k}}} \quad {{\varphi _{k}}} \quad {{{\dot \varphi }_{k}}} \right ]^{T}}\); ϕk and \({\dot \phi _{k}}\) are the target position and velocity in the X-direction, respectively, and φk and \({\dot \varphi _{k}}\) are the target position and velocity in the Y-direction, respectively. ω denotes the turn rate, and t=1 s is the sampling time. The root mean square errors (RMSEs) of position and velocity are used to evaluate the performance of the proposed algorithm.
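The constant-turn transition matrix of Eq. (47) can be constructed directly from the turn rate; a sketch (the function name is hypothetical, and a nonzero ω is assumed since ω=0 reduces to the constant-velocity model):

```python
import numpy as np

def ct_transition(omega, t=1.0):
    """Constant-turn transition matrix of Eq. (47); requires omega != 0."""
    s, c = np.sin(omega * t), np.cos(omega * t)
    return np.array([[1, s / omega,       0, (c - 1) / omega],
                     [0, c,               0, -s],
                     [0, (1 - c) / omega, 1, s / omega],
                     [0, s,               0, c]])
```

The velocity components undergo a pure rotation, so the matrix is volume-preserving (unit determinant).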

For the proposed algorithm, the system state is estimated based on the basic model set \({\mathcal {M}^{b}}\) and the candidate model set \({\mathcal {M}^{c}}\), which contain 3 and 8 models, respectively. All models are constant turn (CT) models [1] that differ only in the turn rate ωi. The basic model set \({\mathcal {M}^{b}}\) consists of the initial models of the IMM algorithm, and the candidate model set \({\mathcal {M}^{c}}\) consists of models with structures different from the basic models.

5.1 Experiment 1

In this case, a single target moves in a 2-D scenario with a [−100, 100] m × [−100, 100] m surveillance region. The turn rate in different time periods is shown in Table 1.

Table 1 The turn rate in different time periods

The measurement equation is expressed as

$$ {z_{k}} =\left[\begin{array}{cccc} 1&0&0&0\\ 0&0&1&0 \end{array}\right]{x_{k}} + {v_{k}} $$
(48)

where the measurement noise vk follows a zero-mean Gaussian distribution with unknown covariance matrix Σk. The initial degrees of freedom and symmetric positive definite scale matrix are κ0,i=5 (i=1,2) and Λ0=diag{20,20}. The turn rates ωi and ωj belong to the following sets

$$\begin{array}{*{20}l} {\omega_{i}} \in \left(- 6.67 \times {10^{- 4}},0,6.67 \times {10^{- 4}}\right) \quad {{\omega_{i}} \in {\mathcal{M}^{b}}}\\ {\omega_{j}} \in \pi \times {10^{- 3}} \times (\pm4.2, \pm5.6, \pm8.3, \pm16.7) \quad {{\omega_{j}} \in {\mathcal{M}^{c}}} \end{array} $$
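The measurement model of Eq. (48) and its noise prior can be sketched as follows. The measurement matrix and the prior parameters (κ0=5, Λ0=diag{20,20}) follow the text; the true covariance Σ chosen below and the function name are illustrative assumptions, since the actual noise levels are varied in the experiments.

```python
import numpy as np

# Measurement matrix of Eq. (48): extracts the X and Y positions
# from the state [phi, phi_dot, varphi, varphi_dot].
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

# Prior parameters for the unknown covariance Sigma_k: degrees of
# freedom kappa_0 = 5 and scale matrix Lambda_0 = diag(20, 20).
# (The exact Wishart-type parameterization follows the paper's VB
# derivation and is not reproduced here.)
kappa0 = 5.0
Lambda0 = np.diag([20.0, 20.0])

rng = np.random.default_rng(0)
Sigma_true = np.diag([10.0, 10.0])  # illustrative true covariance (our choice)

def measure(x):
    """Noisy position measurement z_k = H x_k + v_k, v_k ~ N(0, Sigma_true)."""
    return H @ x + rng.multivariate_normal(np.zeros(2), Sigma_true)

z = measure(np.array([50.0, 1.0, -20.0, 0.5]))
```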

The transition probability matrix (TPM) of the IMM11 \(\Pi ^{11} = {(\pi _{i,j}^{11})_{11 \times 11}}\) is extended from the TPM of the basic models set. The TPM \(\Pi _{b}^{3} = {(\pi _{i,j}^{3})_{3 \times 3}}\) is shown as

$$ \Pi_{b}^{3} = \left[\begin{array}{ccc} 0.8 & 0.1 & 0.1\\ 0.1 & 0.8 & 0.1\\ 0.1 & 0.1 & 0.8 \end{array}\right] $$
(49)
$$ \pi_{i,j}^{11}= \left\{\begin{array}{ll} \pi_{i,j}^{3}-a & i = j,\ i\le 3,\ j \le 3\\ \pi_{i,j}^{3} & i \ne j,\ i\le 3,\ j \le 3\\ a/8 & i \le 3,\ j > 3 \\ 1-10b & i = j,\ i > 3 \\ b & i \ne j,\ i > 3 \end{array}\right. $$
(50)

where i,j=1,…,11, and a=b=0.01.
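The TPM extension of Eqs. (49)–(50) can be sketched as follows; the function name is ours, while the construction and the values a=b=0.01 follow the equations above.

```python
import numpy as np

def extend_tpm(Pi3, n_total=11, a=0.01, b=0.01):
    """Extend the 3x3 basic TPM of Eq. (49) to the 11x11 TPM of Eq. (50)."""
    n_basic = Pi3.shape[0]
    n_cand = n_total - n_basic       # number of candidate models (8 here)
    Pi = np.empty((n_total, n_total))
    for i in range(n_total):
        for j in range(n_total):
            if i < n_basic and j < n_basic:
                # Basic-to-basic block: subtract a from the diagonal only.
                Pi[i, j] = Pi3[i, j] - a if i == j else Pi3[i, j]
            elif i < n_basic:
                # Basic-to-candidate: spread a uniformly over candidates.
                Pi[i, j] = a / n_cand
            elif i == j:
                # Candidate diagonal: 1 - 10b for n_total = 11.
                Pi[i, j] = 1.0 - (n_total - 1) * b
            else:
                # Candidate off-diagonal entries.
                Pi[i, j] = b
    return Pi

Pi3 = np.full((3, 3), 0.1) + 0.7 * np.eye(3)   # Eq. (49)
Pi11 = extend_tpm(Pi3)
```

Each row of the extended matrix still sums to one, so Π11 remains a valid transition probability matrix.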

The RMSEs of position are shown in Figs. 1 and 2. The RMSE of MSA-VB is lower than that of IMM3-VB. Note that the IMM3 and IMM11 perform slightly better than the IMM3-VB, IMM11-VB, and MSA-VB at the beginning (t<25), because the VB-based filters need several iterations to eliminate the error between the initial noise variance and the true noise variance. The RMSEs of velocity are shown in Figs. 3 and 4. From these figures, it can be observed that the velocity RMSEs of the IMM11-VB and MSA-VB are lower than those of the IMM3, IMM11, and IMM3-VB. During the maneuver, MSA-VB also outperforms the other four algorithms.

Fig. 1
figure 1

RMSEs in X position of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 1

Fig. 2
figure 2

RMSEs in Y position of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 1

Fig. 3
figure 3

RMSEs in X Velocity of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 1

Fig. 4
figure 4

RMSEs in Y Velocity of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 1

In [29] and [23], the forgetting factor ρ is chosen empirically. The average RMSEs of the state for different values of ρ are shown in Fig. 5. Simulation results show that the proposed algorithm performs best when ρ=0.92.
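The role of ρ can be illustrated by a forgetting-based prediction step for the noise-posterior parameters, a sketch following the common formulation in VB adaptive filters such as [23, 29]; the exact update used in this paper is given by its own derivation, and the function name and initial values below are ours.

```python
def forget(kappa, Lam, rho=0.92):
    """Propagate the noise-posterior parameters between time steps.

    Multiplying both parameters by rho < 1 spreads (flattens) the
    posterior, which lets the filter track a time-varying measurement
    noise covariance instead of locking onto a stale estimate.
    """
    return rho * kappa, rho * Lam

kappa, Lam = 5.0, 20.0
for _ in range(3):            # three prediction steps with rho = 0.92
    kappa, Lam = forget(kappa, Lam)
```

A smaller ρ forgets old noise statistics faster but makes the estimate noisier, which is why ρ is tuned empirically (Fig. 5).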

Fig. 5
figure 5

RMSEs in position with different values of the forgetting factor

The RMSEs of the five algorithms with different measurement noise variance parameters are given in Table 2. The variance parameters are chosen as 5, 10, 20, and 50, respectively. From Table 2, the proposed algorithm and the IMM11-VB have the smallest RMSEs among these algorithms for all measurement noise variance parameters. Comparing the IMM3, IMM3-VB, IMM11, and IMM11-VB, the average RMSEs of the IMM3 and IMM3-VB are larger than those of the IMM11 and IMM11-VB, because more models are adapted for maneuvering target tracking. Since VB effectively estimates the noise variance parameters, the average RMSEs of the IMM3-VB and IMM11-VB are smaller than those of the IMM3 and IMM11, respectively.

Table 2 Average RMSEs in position with different measurement noise variance parameters for the Experiment 1

The comparison of relative computational times is shown in Table 3. The CPU time needed for the IMM11-VB is three times that of the MSA-VB.

Table 3 Relative computational times for the Experiment 1

5.2 Experiment 2

In this scenario, a bearings-only tracking (BOT) problem is considered. The target is located at the coordinates [0, 0] m. The initial state of the observation platform is [500 m, 20 m/s, 800 m, 10 m/s]. First, the observation platform moves according to a constant velocity (CV) model for 100 s, then according to a CT model for 50 s with turn rate ω=−0.0232 rad/s, and finally according to a CV model for 100 s. The measurement model at time k is

$$ {\vartheta_{k}} = {\tan^{- 1}}\left(\frac{{\phi_{k}^{o} - \phi_{k}^{t}}}{{\varphi_{k}^{o} - \varphi_{k}^{t}}}\right) + {v_{k}} $$
(51)
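The bearing measurement of Eq. (51) can be sketched as follows. We use `np.arctan2` rather than a plain arctangent so the quadrant is resolved correctly over the full circle (a practical substitution for the tan⁻¹ in Eq. (51)); the function name and the noise argument are our own additions.

```python
import numpy as np

def bearing(obs_xy, tgt_xy, noise_std=0.0, rng=None):
    """Bearing measurement of Eq. (51).

    obs_xy and tgt_xy are the (x, y) positions of the observation
    platform and the target; noise_std is the standard deviation
    of the additive measurement noise v_k.
    """
    dx = obs_xy[0] - tgt_xy[0]   # phi^o - phi^t
    dy = obs_xy[1] - tgt_xy[1]   # varphi^o - varphi^t
    theta = np.arctan2(dx, dy)   # quadrant-safe replacement for tan^-1(dx/dy)
    if noise_std > 0.0:
        rng = rng or np.random.default_rng()
        theta += rng.normal(0.0, noise_std)
    return theta

# Initial geometry of Experiment 2: platform at (500, 800) m, target at origin.
theta0 = bearing((500.0, 800.0), (0.0, 0.0))
```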

where the superscripts o and t denote the observation platform and the target, respectively, and vk is the measurement noise. The turn rates of the basic model set \({\mathcal {M}^{b}}\) and the candidate model set \({\mathcal {M}^{c}}\) are

$$\begin{array}{*{20}l} {\omega_{i}} \in \pi \times {10^{- 3}} \times (- 1,0,1) \quad {{\omega_{i}} \in {\mathcal{M}^{b}}}\\ {\omega_{j}} \in \pi \times {10^{- 4}} \times (\pm0.5, \pm0.667, \pm1, \pm2) \quad {{\omega_{j}} \in {\mathcal{M}^{c}}} \end{array} $$

The TPMs are shown in Eqs. (49) and (50).

The RMSEs of X and Y position and velocity over 200 Monte Carlo runs are shown in Figs. 6, 7, 8 and 9, respectively. It can be observed that the proposed algorithm performs slightly better than the other four algorithms, and that the RMSEs of the IMM11 and IMM11-VB become larger around the time steps where the true measurement noise variance changes abruptly. As can be seen from Fig. 7, the RMSE of the IMM11 fluctuates sharply during the noise-varying process, because the dimension of the measurement vector is lower than that of the state vector. The result of the variance estimation is shown in Fig. 10. We can see that the proposed algorithm effectively estimates the measurement noise statistics, at the cost of a small time delay: part of the old noise estimate is carried over from the previous time step to the next. The RMSEs of the five algorithms with different measurement noise variance parameters are given in Table 4. The variance parameters are chosen as 0.0001, 0.0005, 0.001, and 0.01, respectively. It can be seen from Table 4 that the average RMSEs increase as the noise variance parameters increase. Compared to the IMM3, IMM3-VB, IMM11, and IMM11-VB, the proposed algorithm has higher tracking accuracy.

Fig. 6
figure 6

RMSEs in X position of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 2

Fig. 7
figure 7

RMSEs in Y position of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 2

Fig. 8
figure 8

RMSEs in X velocity of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 2

Fig. 9
figure 9

RMSEs in Y velocity of the proposed algorithm, IMM3, IMM11, IMM3-VB, and IMM11-VB in scenario 2

Fig. 10
figure 10

Estimation result of Σk

Table 4 Average RMSEs in position with different measurement noise variance parameters for the experiment 2

The comparison of relative computational times is shown in Table 5, where the CPU time of the IMM3 is taken as the unit. It is obvious that the MSA-VB needs less computational time than the IMM11-VB.

Table 5 Relative computational times for the experiment 2

6 Discussion

We have shown that the proposed algorithm mitigates the effects of system model and noise statistics uncertainties in the target tracking system, and that it can effectively estimate the target state, as demonstrated in the numerical simulations. The estimation accuracy was examined by comparing the proposed algorithm with the IMM-VB and IMM methods. The average RMSEs of the positions and velocities of the proposed algorithm are smaller than those of the other algorithms, showing improved estimation and tracking performance. In addition, we found that the computational complexity of the proposed method is higher than that of the IMM3-VB and IMM methods, because most of its computational time is spent on reconstructing the adapted model set and estimating the noise parameters with the VB method. It should be noted that this study concentrates only on single-sensor target tracking with measurement noise uncertainty. Hence, future work will focus on extending it to multi-sensor target tracking with unknown measurement and process noise.

7 Conclusions

In this paper, we presented an adaptively robust filter to address the performance degradation of the IMM-VB in the presence of system model and noise statistics uncertainties. The main contribution of this paper is an MSA method that chooses the best-matching model by calculating the divergence between the candidate models and the true mode. Based on the chosen model, a model-conditioned estimation based on variational Bayesian approximation is proposed to estimate the system state and noise parameters. The performance of the MSA-VB is evaluated over different target tracking scenarios. The RMSEs of the positions and velocities show higher accuracy compared with the IMM-VB and IMM methods.