Facing undermodelling in Sign-Perturbed-Sums system identification

https://doi.org/10.1016/j.sysconle.2021.104936

Abstract

Sign-Perturbed Sums (SPS) is a finite sample system identification method that constructs exact, non-asymptotic confidence regions for the unknown parameters of linear systems without using any knowledge about the disturbances except that they are symmetrically distributed. In the available literature, the theoretical properties of SPS have been investigated under the assumption that the order of the system model is known to the user. In this paper, we analyse the behaviour of SPS when the model assumed by the user does not match the data generation mechanism, and we propose a new SPS algorithm able to detect the circumstance that the model order is incorrect.

Introduction

Estimating parameters of partially unknown systems based on observations corrupted by noise is a fundamental problem in system identification, signal processing and statistics at large, with an impact on control and prediction methods in machine learning [1], [2], [3]. Several standard approaches, such as the Least Squares (LS) method or, more generally, prediction error methods, can be employed to obtain point estimates of the unknown parameters. In many situations, for example when the safety, stability or quality of a process has to be guaranteed, a point estimate should be accompanied by an uncertainty region that certifies the accuracy of the estimate. In standard statistical system identification approaches, confidence regions are constructed based on theoretical analyses that, typically, have only asymptotic validity, while guarantees that are valid for a finite sample of observations require strong assumptions on the data generation mechanism. Along a different route, finite sample results can be obtained by resorting to worst-case identification approaches where the noise is guaranteed to belong to a bounded set, see e.g. [4], [5], [6], [7]. In this approach, a region for the unknown system parameters can be constructed by including in the region those parameter vectors that are consistent with the observed data given all the possible realizations of the noise in the noise bounding set. Such methodologies construct uncertainty regions that are robust, but often conservative in practice. These issues have recently led to studies, see e.g. [8] and the references therein, where the worst-case approach is complemented with statistical knowledge to reduce conservatism, and, on a different line of research, to a class of statistical finite sample identification algorithms (see, e.g., [9], [10], [11], [12], [13], [14], [15], [16], [17], [18] and [19] for a recent overview) that (i) do not rely on noise bounding sets and (ii) are robust to statistical uncertainties.

This paper focuses on one of these statistical, finite sample algorithms: the Sign-Perturbed Sums (SPS) algorithm [12], [20]. SPS constructs confidence regions around the Least Squares Estimate (LSE). In the case of Finite Impulse Response (FIR) systems, several important properties of SPS have been proven rigorously. In particular, we recall here that the regions constructed by SPS have an exact coverage probability (i.e., they include the true parameter vector with an exact and user-chosen probability) independently of the specific, unknown distribution of the noise, which is only assumed to be symmetric and to form an independent sequence. Moreover, the SPS regions are strongly consistent [20].

All the known properties of SPS, however, have been derived under the assumption that the true data generating system belongs to the model class considered by the user, with known model order (at least, an upper-bound on the model order should be known). In system identification practice, this information typically comes from domain knowledge or from a model order selection step. Model order selection is a standard topic in the system identification literature and various supporting tools are available to the user, see e.g., [1], [3], [21], and [22], [23], [24], [25] for more recent contributions. However, this is still a difficult and theoretically challenging problem, which might end up with an underestimate of the model order, [26], [27], [28], [29].

In this paper, we study the behaviour of SPS in the presence of undermodelling and argue that, with undermodelling, the SPS regions, which are no longer exact, may induce a false sense of confidence in the incautious user. Thus, we introduce a modified version of SPS with the following properties:

  • if the system is not undermodelled, the algorithm builds exact, non-asymptotic confidence regions for the true model parameter vector;

  • if the system is undermodelled, the algorithm has a propensity to warn the user that there is a mismatch between the data generation mechanism and the postulated model class (see the simulation section for practical examples) and, moreover, it certainly detects undermodelling when the number of data points becomes large (see the asymptotic Theorem 5).

After a brief discussion of the related literature in the following Section 1.2, we review the standard SPS method in Section 2. Standard SPS in the presence of undermodelling is analysed in Section 3 and this theoretical analysis will provide a motivation for the new algorithm UD-SPS (SPS with Undermodelling Detection), which we introduce and study in Section 4. Computational aspects of UD-SPS are discussed in Section 5. An illustration on a numerical example is offered in Section 6. Section 7 presents our conclusions and outlines some future directions of research.

Although a simulation experiment on the effect of undermodelling on SPS was carried out in [12], the available literature on the theory of SPS does not consider this possibility. If we look at the broader class of statistical finite sample identification methods, the algorithm in [30] allows the user to estimate a subset of the unknown system parameters, which makes the algorithm applicable also when the true model order is higher than the selected one. However, unlike SPS, the algorithm in [30] does not build regions around the LS estimate, and, most importantly, it can be applied only if the measurable input is known to satisfy (or can be chosen so as to satisfy) precise statistical properties (e.g., the input has to be a symmetric white process, or a filtered version of it through a known filter). In this paper, instead, no assumptions will be made on the input except for standard excitation conditions. The conference paper [31] contains a preliminary exposition of the ideas of this paper, which are here revised, developed and accompanied by rigorous theoretical derivations that were not given in [31]; the discussion on the practical and computational aspects and the numerical examples are also new.

The SPS algorithm

In this section we summarize the basic ideas behind the SPS algorithm, and recall the fundamental theorem about its exact confidence and its asymptotic properties. The concepts and results of this section are presented more comprehensively and made precise in [12] and [20]. The reader is referred to these papers for more details and comparisons with related methodologies. For an overview of SPS in the context of statistical finite sample methods, see also [19].
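As a rough illustration of the construction summarized above, the following Python sketch tests whether a candidate parameter vector belongs to an SPS confidence region for a linear regression model. It follows the general recipe of SPS (residuals, sign-perturbed sums normalised by the regressor covariance, and a rank test with confidence level 1 - q/m); the function names, argument names and synthetic data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def draw_sps_randomness(n, m, rng):
    """Draw the m-1 random sign sequences and the tie-breaking permutation."""
    signs = rng.choice([-1.0, 1.0], size=(m - 1, n))
    perm = rng.permutation(m)
    return signs, perm

def sps_indicator(theta, Phi, y, signs, perm, q):
    """Return True if theta lies in the SPS region with confidence 1 - q/m.

    Phi   : (n, d) matrix whose rows are the regressors phi_t^T
    y     : (n,) vector of outputs
    signs : (m-1, n) array of +/-1 values, drawn once and reused for every theta
    perm  : permutation of 0..m-1, used only to break ties
    """
    n = Phi.shape[0]
    m = signs.shape[0] + 1
    R = Phi.T @ Phi / n
    # With R = L L^T, solving L z = v gives ||z||^2 = v^T R^{-1} v.
    L = np.linalg.cholesky(R)
    eps = y - Phi @ theta  # residuals at the candidate theta

    def squared_norm(weights):
        v = Phi.T @ (weights * eps) / n
        z = np.linalg.solve(L, v)
        return float(z @ z)

    z0 = squared_norm(np.ones(n))                       # reference sum
    zs = [squared_norm(signs[i]) for i in range(m - 1)]  # sign-perturbed sums

    # Rank of the reference sum among all m sums (1 = smallest).
    rank = 1
    for i, zi in enumerate(zs, start=1):
        if zi < z0 or (zi == z0 and perm[i] < perm[0]):
            rank += 1
    return rank <= m - q

# Illustrative synthetic data (all values below are arbitrary choices).
rng = np.random.default_rng(0)
n, m, q = 200, 100, 5  # confidence level 1 - q/m = 0.95
Phi = rng.normal(size=(n, 2))
theta_true = np.array([1.0, -0.5])
y = Phi @ theta_true + rng.laplace(scale=0.2, size=n)
signs, perm = draw_sps_randomness(n, m, rng)
theta_ls = np.linalg.lstsq(Phi, y, rcond=None)[0]

print(sps_indicator(theta_ls, Phi, y, signs, perm, q))         # least-squares estimate
print(sps_indicator(theta_ls + 10.0, Phi, y, signs, perm, q))  # a point far from it
```

The random signs and the permutation are drawn once and then held fixed while different values of theta are tested, which is what makes the set of accepted points a well-defined region.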

SPS in the presence of undermodelling

From the previous section we know that a crucial role in the SPS idea is played by the condition that $\{Y_t - \varphi_t^{T}\theta\} = \{N_t\}$, i.e., the actual noise sequence is reconstructed from the measured data when the true parameter is correctly guessed. This is obviously true when the true system structure is known, which is the standard assumption in the SPS literature.
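To make the role of this condition concrete, the following worked step contrasts the matching case with an undermodelled one. The symbols $\theta^{*}$ (true value of the modelled parameters), $\psi_t$ (regressor components left out of the model) and $\eta$ (their coefficients) are notation assumed here for illustration only and may differ from the notation of the paper.

```latex
\begin{align*}
\text{matching case:}\quad
  & Y_t = \varphi_t^{T}\theta^{*} + N_t
  \;\Longrightarrow\;
  Y_t - \varphi_t^{T}\theta = N_t + \varphi_t^{T}(\theta^{*} - \theta), \\
\text{undermodelled case:}\quad
  & Y_t = \varphi_t^{T}\theta^{*} + \psi_t^{T}\eta + N_t
  \;\Longrightarrow\;
  Y_t - \varphi_t^{T}\theta = N_t + \psi_t^{T}\eta + \varphi_t^{T}(\theta^{*} - \theta).
\end{align*}
```

In the matching case the residual equals $N_t$ at $\theta = \theta^{*}$. In the undermodelled case the residual at $\theta = \theta^{*}$ is $N_t + \psi_t^{T}\eta$, so the reconstructed sequence is in general neither symmetric nor independent, and the argument behind exact coverage no longer applies as it stands.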

In this section, we study the behaviour of the SPS algorithm in a more general setting which includes the possibility that the true data

UD-SPS: A modified SPS method

We here define the UD-SPS algorithm, and discuss the main idea behind it. Clarifying the connection between UD-SPS and the standard SPS makes it easy to prove that UD-SPS inherits the most important properties of standard SPS when the system is correctly specified (matching case). Then, we show that UD-SPS can be used to detect undermodelling.

Computational aspects of UD-SPS and practical usage

The UD-SPS pseudocode of Section 4 allows one to easily verify whether a certain value of θ is inside or outside the UD-SPS region. On the other hand, in system identification practice, one often desires to construct the whole region to which the system parameter vector has to belong, which may take a long time if each single point in a grid has to be examined. Moreover, with UD-SPS, it is important to check whether the region is empty, which might be a challenging task to be accomplished via a

A numerical experiment

The true data generation mechanism is the following FIR(2) system
$$Y_t = 0.75\,U_{t-1} - 0.3\,U_{t-2} + N_t,$$
where $\{N_t\}$ is a sequence of i.i.d. Laplacian random variables with zero mean and variance 0.1. The input signal is generated as
$$U_t = 0.5\,U_{t-1} + V_t,$$
where $\{V_t\}$ is a sequence of i.i.d. Gaussian random variables with zero mean and variance 1, independent of $\{N_t\}$.
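The data generation mechanism just described can be simulated with a few lines of Python. This is a minimal sketch that takes the FIR coefficients as 0.75 and -0.3 and the input recursion coefficient as 0.5, which is our reading of the equations in this section; the function name, the burn-in length used to initialise the input recursion and the random seed are assumptions introduced for the example.

```python
import numpy as np

def simulate_fir2(n, rng, burn_in=100):
    """Simulate n samples of the FIR(2) system driven by an AR(1) input.

    Returns (u_lag1, u_lag2, Y), where u_lag1[k] and u_lag2[k] are the
    inputs one and two steps before the output Y[k].
    The burn-in is an assumption: the source does not say how the input
    recursion is initialised.
    """
    total = burn_in + n + 2
    V = rng.normal(0.0, 1.0, size=total)  # Gaussian, zero mean, variance 1
    U = np.zeros(total)
    for t in range(1, total):
        U[t] = 0.5 * U[t - 1] + V[t]
    U = U[burn_in:]                       # keep n + 2 samples after the transient

    # A Laplace(0, b) variable has variance 2 b^2, so variance 0.1 needs b = sqrt(0.05).
    N = rng.laplace(0.0, np.sqrt(0.05), size=n)

    u_lag1 = U[1:-1]                      # U_{t-1}
    u_lag2 = U[:-2]                       # U_{t-2}
    Y = 0.75 * u_lag1 - 0.3 * u_lag2 + N
    return u_lag1, u_lag2, Y

rng = np.random.default_rng(1)
u1, u2, Y = simulate_fir2(50, rng)        # n = 50 data points

# Least-squares fit of a correctly specified two-parameter model.
Phi = np.column_stack([u1, u2])
theta_ls = np.linalg.lstsq(Phi, Y, rcond=None)[0]
print(theta_ls)
```

Fitting only the first column of `Phi` instead would correspond to a first-order model, which is one way to mimic an undermodelled situation with the same data.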

The simple identification problem that we provide here to illustrate some basic facts of UD-SPS is described as follows: the sequence $Y_1,\dots,Y_n$, $n=50$, is

Conclusions and future work

In this paper we have extended the study of SPS, a statistical, guaranteed, finite sample system identification method, to the case of undermodelled dynamics. While undermodelled dynamics is not covered by the standard SPS algorithm, we have also shown that the SPS algorithm does not contain any mechanism to warn the user that the algorithm is working outside of its domain of applicability.

We have then proposed an extension of SPS, called UD-SPS, which is able to detect that the algorithm is

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

A. Carè and B. Cs. Csáji were (partially) supported by the European Commission through the H2020 project Centre of Excellence in Production Informatics and Control (EPIC CoE, Grant No. 739592). B. Cs. Csáji was also supported by the Ministry of Innovation and Technology, Hungary, NRDI Office, within the framework of the Autonomous Systems National Laboratory Program. A. Carè and M.C. Campi were partially supported by MIUR and the University of Brescia under project CLAFITE.

The authors would

References (40)

  • Brooks R.J. et al., Choosing the best model: Level of detail, complexity, and model performance, Math. Comput. Modelling (1996)
  • Campi M.C. et al., Non-asymptotic confidence regions for model parameters in the presence of unmodelled dynamics, Automatica (2009)
  • Carè A. et al., Undermodelling detection with sign-perturbed sums, IFAC-PapersOnLine (2017)
  • Wahlberg B. et al., On approximation of stable linear dynamical systems using Laguerre and Kautz functions, Automatica (1996)
  • Ljung L., System Identification: Theory for the User (1999)
  • Söderström T. et al., System Identification (1989)
  • Milanese M. et al., Bounding Approaches To System Identification (2013)
  • Dabbene F. et al., Probabilistic optimal estimation with uniformly distributed noise, IEEE Trans. Automat. Control (2014)
  • Granichin O.N., The nonasymptotic confidence set for parameters of a linear control object under an arbitrary external disturbance, Autom. Remote Control (2012)
  • Csáji B.Cs. et al., Sign-perturbed sums: A new system identification approach for constructing exact non-asymptotic confidence regions in linear regression models, IEEE Trans. Signal Process. (2015)

    All authors contributed in equal part to all aspects of this paper.
