Abstract
This article considers the problem of optimally recovering stable linear time-invariant systems observed via linear measurements made on their transfer functions. A common modeling assumption is replaced here by the related assumption that the transfer functions belong to a model set described by approximation capabilities. Capitalizing on recent optimal recovery results relative to such approximability models, we construct some optimal algorithms and characterize the optimal performance for the identification and evaluation of transfer functions in the framework of the Hardy Hilbert space and of the disk algebra. In particular, we determine explicitly the optimal recovery performance for frequency measurements taken at equispaced points on an inner circle or on the torus.
Notes
To compute the \({\mathcal {H}}_2\)-error between functions \(F = \sum _{j=1}^\infty b_j V_j\) and \(\widetilde{F} = \sum _{j=1}^n c_j V_j + \sum _{k=1}^m d_k L_k\), we used the fact that \(\Vert F -\widetilde{F} \Vert _{{\mathcal {H}}_2}^2 = \Vert F\Vert _{{\mathcal {H}}_2}^2 + \Vert \widetilde{F}\Vert _{{\mathcal {H}}_2}^2 - 2 {\text {Re}}\langle F, \widetilde{F} \rangle \), together with \(\Vert \widetilde{F}\Vert _{{\mathcal {H}}_2}^2 = \Vert c\Vert _2^2 + \langle d, H d \rangle + 2 {\text {Re}}\langle c, G d \rangle \) and \(\langle F, \widetilde{F} \rangle = \langle b_{1:n}, c \rangle + \sum _{k=1}^m {\overline{d_k}} \ell _k(F)\).
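The identity in this note can be checked numerically in a finite-dimensional surrogate of \({\mathcal {H}}_2\): below, the ambient space is \({\mathbb {C}}^N\), the \(V_j\) are standard basis vectors (orthonormal), the \(L_k\) are arbitrary vectors with Gram matrix \(H_{kl} = \langle L_l, L_k \rangle \) and cross matrix \(G_{jk} = \langle L_k, V_j \rangle \), and \(\ell _k(F) = \langle F, L_k \rangle \). The dimensions and random data are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)

# finite-dimensional stand-in for H2: C^N with <x, y> = sum_i x_i conj(y_i),
# V_j = standard basis vectors (orthonormal), L_k = arbitrary vectors
N, n, m = 8, 3, 2
V = np.eye(N, n)                                  # columns V_1, ..., V_n
L = rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m))
b = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # F = sum_j b_j V_j (truncated)
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
d = rng.standard_normal(m) + 1j * rng.standard_normal(m)

inner = lambda x, y: np.vdot(y, x)                # linear in x, conjugate-linear in y

F = b
Ftil = V @ c + L @ d                              # F~ = sum_j c_j V_j + sum_k d_k L_k

G = V.conj().T @ L                                # G_{jk} = <L_k, V_j>
H = L.conj().T @ L                                # H_{kl} = <L_l, L_k>
ellF = L.conj().T @ F                             # ell_k(F) = <F, L_k>

# direct squared error versus the expanded formula from the note
direct = np.linalg.norm(F - Ftil) ** 2
norm_Ftil = np.linalg.norm(c) ** 2 + inner(H @ d, d).real + 2 * inner(c, G @ d).real
ip_F_Ftil = inner(b[:n], c) + np.conj(d) @ ellF
expanded = np.linalg.norm(F) ** 2 + norm_Ftil - 2 * ip_F_Ftil.real

assert abs(direct - expanded) < 1e-10
```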
References
Akcay H, Gu G, Khargonekar PP (1994) Identification in \({\cal{H}}_{\infty }\) with nonuniformly spaced frequency response measurements. Int J Robust Nonlinear Control 4(4):613–629
Binev P, Cohen A, Dahmen W, DeVore R, Petrova G, Wojtaszczyk P (2017) Data assimilation in reduced modeling. SIAM/ASA J Uncertain Quantif 5(1):1–29
Bokor J, Schipp F, Gianone L (1995) Approximate \({\cal{H}}_{\infty }\) identification using partial sum operators in disc algebra basis. In: Proceedings of American control conference, pp 1981–1985
Campi MC, Weyer E (2002) Finite sample properties of system identification methods. IEEE Trans Autom Control 47(8):1329–1334
Chen J, Gu G (2000) Control-oriented system identification: an \({\cal{H}}_\infty \) approach, vol 19. Wiley, Hoboken
Chen J, Nett CN (1993) The Carathéodory–Fejér problem and \({\cal{H}}_{\infty }\) identification: a time domain approach. In: Proceedings of the 32nd IEEE conference on decision and control, pp 68–73
Chen J, Nett CN, Fan MKH (1992) Worst-case system identification in \({\cal{H}}_{\infty }\): validation of a priori information, essentially optimal algorithms, and error bounds. In: Proceedings of American control conference, pp 251–257
Chen J, Nett CN, Fan MKH (1992) Optimal non-parametric system identification from arbitrary corrupt finite time series: a control-oriented approach. In: Proceedings of American control conference, pp 279–285
DeVore R, Foucart S, Petrova G, Wojtaszczyk P (2019) Computing a quantity of interest from observational data. Constr Approx 49(3):461–508
DeVore R, Petrova G, Wojtaszczyk P (2017) Data assimilation and sampling in Banach spaces. Calcolo 54(3):963–1007
Helmicki AJ, Jacobson CA, Nett CN (1991) Control oriented system identification: a worst-case/deterministic approach in \({{\cal{H}}}_\infty \). IEEE Trans Autom Control 36(10):1163–1176
Micchelli C, Rivlin T (1977) A survey of optimal recovery. Optimal estimation in approximation theory. Springer, Boston, pp 1–54
Partington JR (1997) Interpolation, identification, and sampling. Oxford University Press, Oxford
Rakhmanov E, Shekhtman B (2006) On discrete norms of polynomials. J Approx Theory 139(1–2):2–7
Shah P, Bhaskar BN, Tang G, Recht B (2012) Linear system identification via atomic norm regularization. arXiv preprint arXiv:1204.0590
Tu S, Boczar R, Packard A, Recht B (2017) Non-asymptotic analysis of robust control from coarse-grained identification. arXiv preprint arXiv:1707.04791
Vidyasagar M, Karandikar RL (2008) A learning theory approach to system identification and stochastic adaptive control. J Process Control 18(3):421–430
Zhou K, Doyle JC (1998) Essentials of robust control. Prentice Hall, Upper Saddle River
Simon Foucart is partially supported by NSF Grants DMS-1622134 and DMS-1664803, and also acknowledges the NSF Grant CCF-1934904.
Appendix: Proofs of essential results
In this section, we fully justify some statements appearing in the text but not yet established, namely the relation between (9)–(10) and (11), as well as the validity in the complex setting of results about optimal identification in Hilbert spaces [2] and optimal estimation in Banach spaces [9]. We start with how (11) connects to the descriptions (9)–(10) of the models put forward in [11].
Proposition 6
With \({\mathcal {X}}\) denoting either \({\mathcal {H}}_2({\mathbb {D}})\) or \({\mathcal {A}}({\mathbb {D}})\), the following properties are equivalent:
Proof
We write \(F(z) = \sum _{n=0}^\infty f_n z^n\) throughout the proof. We first establish the equivalence in the case \({\mathcal {X}}= {\mathcal {H}}_2({\mathbb {D}})\). Let us assume that (74) holds, i.e., that \(\sum _{n=0}^\infty |f_n|^2 \rho ^{2n} \le M^2\) for some \(\rho >1\) and \(M>0\). In particular, we have \(|f_n|^2 \le M^2 \rho ^{-2n}\) for all \(n \ge 0\). It follows that, for all \(n \ge 0\),
hence, (75) holds with a change in the constant M. Conversely, let us assume that (75) holds, i.e., that there are \(\rho > 1\) and \(M>0\) such that \(\sum _{k=n}^\infty |f_k|^2 \le M^2 \rho ^{-2n}\) for all \(n \ge 0\). In particular, we have \(|f_n|^2 \le M^2 \rho ^{-2n}\) for all \(n \ge 0\). Then, picking \(\widetilde{\rho } \in (1,\rho )\), we derive that
hence, (74) holds with a change in both \(\rho \) and M.
We now establish the equivalence in the case \({\mathcal {X}}= {\mathcal {A}}({\mathbb {D}})\). Let us assume that (74) holds, i.e., that \(\sup _{|z| = \rho } |F(z)| \le M\) for some \(\rho > 1\) and \(M>0\). This implies that the Taylor coefficients of F satisfy, for any \(k \ge 0\),
Considering \(P \in {\mathcal {P}}_n\) defined by \(P(z):= \sum _{k=0}^{n-1} f_k z^k\), we obtain
hence, (75) holds with a change in the constant M. Conversely, let us assume that (75) holds, i.e., that there are \(\rho > 1\) and \(M>0\) such that there exists, for each \(n\ge 0\), a polynomial \(P^{[n]} \in {\mathcal {P}}_n\) with \(\Vert F-P^{[n]}\Vert _{{\mathcal {H}}_\infty } \le M \rho ^{-n}\). For all \(n \ge 0\), since the coefficient of \(z^n\) in F is the same as that of \(F-P^{[n]}\), we have
Then, picking \(\widetilde{\rho } \in (1,\rho )\), we derive that
hence, (74) holds with a change in both \(\rho \) and M. \(\square \)
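The equivalence just established can be illustrated numerically. The sketch below takes \(F(z) = 1/(2-z)\), whose Taylor coefficients are \(f_n = 2^{-(n+1)}\), and \(\rho = 3/2\); it checks that F is bounded on the circle \(|z| = \rho \), that the Cauchy estimates \(|f_n| \le M \rho ^{-n}\) hold, and that the \({\mathcal {H}}_2\)-tails decay like \(\rho ^{-2n}\). The specific function and constants are illustrative choices, not taken from the text.

```python
import numpy as np

# F(z) = 1/(2 - z) = sum_n 2^{-(n+1)} z^n is analytic on |z| < 2,
# so any rho in (1, 2) is admissible
rho, Nmax = 1.5, 60
f = np.array([2.0 ** -(n + 1) for n in range(Nmax)])

# (74)-type bound: sup_{|z|=rho} |F(z)| = 1/(2 - rho) =: M
M = 1.0 / (2.0 - rho)
theta = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
Fvals = 1.0 / (2.0 - rho * np.exp(1j * theta))
assert np.max(np.abs(Fvals)) <= M + 1e-12

# Cauchy estimates: |f_n| <= M * rho^{-n} for all n
assert np.all(f <= M * rho ** -np.arange(Nmax) + 1e-12)

# (75)-type bound: the H2-tails sum_{k>=n} |f_k|^2 decay like rho^{-2n}
tails = np.array([np.sum(f[n:] ** 2) for n in range(Nmax)])
C = tails[0]  # tails[n] * rho^{2n} is decreasing here, so n = 0 gives a valid constant
assert np.all(tails <= C * rho ** (-2 * np.arange(Nmax)) + 1e-15)
```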
We now turn to the justification for the complex setting of the results from [2] about optimal identification in Hilbert spaces. As in [2, Theorem 2.8], these results are easy consequences of the following statement.
Theorem 7
Let \({\mathcal {V}}\) be a subspace of a Hilbert space \({\mathcal {X}}\), and let \(\ell _1,\ldots ,\ell _m\) be linear functionals defined on \({\mathcal {X}}\). With a model set given by
the performance of optimal identification from some \(y \in {\mathbb {C}}^m\) satisfies
where the constant \(\mu \) is defined by
Proof
Let \(f^\star \in {\mathcal {X}}\) be constructed from \(y \in {\mathbb {C}}^m\) via \(f^\star := \underset{f \in {\mathcal {X}}}{{\mathrm{argmin}}\,} \Vert f - P_{\mathcal {V}}f\Vert _{\mathcal {X}}\) subject to \({\mathcal {L}}(f) = y\). We shall prove on the one hand that
and on the other hand that, for any \(g \in {\mathcal {X}}\),
Justification of (85): Let us point out that \(f^\star - P_{\mathcal {V}}f^\star \) is orthogonal to both \({\mathcal {V}}\) and \(\ker ({\mathcal {L}})\). To see this, given \(v \in {\mathcal {V}}\), \(u \in \ker ({\mathcal {L}})\), and \(\theta \in [-\pi ,\pi ]\), we notice that, as functions of \(t \in {\mathbb {R}}\), the expressions
are minimized at \(t=0\). Therefore, \({\text {Re}}( e^{- i \theta } \langle f^\star - P_{\mathcal {V}}f^\star , v \rangle ) =0\) and \({\text {Re}}( e^{- i \theta } \langle f^\star - P_{\mathcal {V}}f^\star , u - P_{\mathcal {V}}u \rangle ) = 0\) for all \(\theta \in [-\pi ,\pi ]\). This implies that \(\langle f^\star - P_{\mathcal {V}}f^\star , v \rangle =0\) and \(\langle f^\star - P_{\mathcal {V}}f^\star , u - P_{\mathcal {V}}u \rangle =0\) for all \(v \in {\mathcal {V}}\) and \(u \in \ker ({\mathcal {L}})\), hence our claim. Now, consider \(f \in {\mathcal {K}}\cap {\mathcal {L}}^{-1}(y)\). Since \({\mathcal {L}}(f) = y = {\mathcal {L}}(f^\star )\), we can write \(f = f^\star + u \) for some \(u \in \ker ({\mathcal {L}})\). The fact that \(f \in {\mathcal {K}}\) then yields
so that
It remains to take the definition of \(\mu \) into account to obtain
Justification of (86): Let us select \(u \in \ker ({\mathcal {L}})\) such that
We now consider \(f^\pm := f^\star \pm u\). It is clear that \(f^\pm \in {\mathcal {L}}^{-1}(y)\), and we also have \(f^\pm \in {\mathcal {K}}\), since
Then, for any \(g \in {\mathcal {X}}\),
This completes the proof of the theorem. \(\square \)
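In a finite-dimensional setting, the minimizer \(f^\star \) used in this proof reduces to a linear-algebra computation. The sketch below models \({\mathcal {X}}= {\mathbb {C}}^N\) with a random n-dimensional subspace \({\mathcal {V}}\) and measurements \({\mathcal {L}}(f) = Af\); it computes \(f^\star = {\mathrm{argmin}}\, \Vert f - P_{\mathcal {V}}f\Vert \) subject to \(Af = y\) and verifies the orthogonality of \(f^\star - P_{\mathcal {V}}f^\star \) to both \({\mathcal {V}}\) and \(\ker ({\mathcal {L}})\) established at the start of the proof. All dimensions and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# finite-dimensional model of Theorem 7: X = C^N, V = range of the matrix V,
# measurements L(f) = A f with the rows of A playing the role of ell_1, ..., ell_m
N, n, m = 10, 3, 4
V = np.linalg.qr(rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n)))[0]
A = rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

P = V @ V.conj().T                 # orthogonal projector onto V
Q = np.eye(N) - P                  # f -> f - P_V f

# f_star = argmin ||f - P_V f|| subject to A f = y:
# parametrize f = f0 + K w with A f0 = y and K a basis of ker(A)
f0 = np.linalg.lstsq(A, y, rcond=None)[0]
_, s, Vh = np.linalg.svd(A)
K = Vh.conj().T[:, s.size:]        # N x (N - m) orthonormal basis of ker(A)
w = np.linalg.lstsq(Q @ K, -Q @ f0, rcond=None)[0]
f_star = f0 + K @ w

# f_star is data-consistent, and f_star - P_V f_star is orthogonal to
# both V and ker(L), as claimed in the proof
r = Q @ f_star
assert np.allclose(A @ f_star, y)
assert np.allclose(V.conj().T @ r, 0)
assert np.allclose(K.conj().T @ r, 0)
```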
Finally, we justify below that the result from [9] about optimal estimation in Banach spaces holds in the complex setting, too.
Theorem 8
Let \({\mathcal {V}}\) be a subspace of a Banach space \({\mathcal {X}}\), let \(\ell _1,\ldots ,\ell _m\) be linear functionals defined on \({\mathcal {X}}\), and let Q be another linear functional defined on \({\mathcal {X}}\). With a model set given by
the performance of optimal estimation of Q satisfies
where the constant \(\mu \) equals the minimum of the optimization problem
Proof
Let \(a^\star \in {\mathbb {C}}^m\) be a minimizer of the optimization program (97), and let \(\nu \) denote the value of the minimum. Let us also consider
We shall prove on the one hand that
and on the other hand that, for any \(A: {\mathbb {C}}^m \rightarrow {\mathbb {C}}\),
and we shall show as a last step that
Justification of (99): Given \(f \in {\mathcal {K}}\), we select \(v \in {\mathcal {V}}\) such that \(\Vert f-v\Vert _{\mathcal {X}}= {\mathrm{dist}}_{\mathcal {X}}(f,{\mathcal {V}})\). The required inequality follows by noticing that
Justification of (100): Let us select \(u \in \ker ({\mathcal {L}})\) such that
Then, for any \(A: {\mathbb {C}}^m \rightarrow {\mathbb {C}}\), we have
Justification of (101): We assume that \(\ker ({\mathcal {L}}) \cap {\mathcal {V}}= \{ 0\}\), otherwise \(\mu = \infty \) and there is nothing to prove. We consider a linear functional \(\lambda \) defined on \(\ker ({\mathcal {L}}) \oplus {\mathcal {V}}\) by
We then consider a Hahn–Banach extension \(\widetilde{\lambda }\) of \(\lambda \) defined on \({\mathcal {X}}\). Because \(Q-\widetilde{\lambda }\) vanishes on \(\ker ({\mathcal {L}})\), we can write \(Q-\widetilde{\lambda } = \sum _{k=1}^m \widetilde{a}_k \ell _k\) for some \(\widetilde{a} \in {\mathbb {C}}^m\), and because \(\widetilde{\lambda }\) vanishes on \({\mathcal {V}}\), we have \(\sum _{k=1}^m \widetilde{a}_k \ell _k(v) = Q(v)\) for all \(v \in {\mathcal {V}}\). We therefore derive that
This concludes the proof of the theorem. \(\square \)
Ettehad, M., Foucart, S. Approximability models and optimal system identification. Math. Control Signals Syst. 32, 19–41 (2020). https://doi.org/10.1007/s00498-020-00253-z