Instances of computational optimal recovery: Refined approximability models☆
Introduction
The objective of this article is to uncover practical methods for the optimal recovery of functions available through observational data when the underlying models based on approximability allow for overparametrization. To clarify this objective and its various challenges, we start with some background on traditional Optimal Recovery. Typically, an unknown function $f$ defined on a domain $\mathcal{X}$ is observed through point evaluations $f(x^{(1)}), \ldots, f(x^{(m)})$ at distinct points $x^{(1)}, \ldots, x^{(m)} \in \mathcal{X}$. More generally, an unknown object $f$, simply considered as an element of a normed space $F$, is observed through $y_i = \lambda_i(f)$, $i = 1, \ldots, m$, where $\lambda_1, \ldots, \lambda_m$ are linear functionals defined on $F$. We assume here that these data are perfectly accurate; we refer to the companion article [5] for the incorporation of observation errors. The data are summarized as $y = \Lambda f \in \mathbb{R}^m$, where the linear map $\Lambda : F \to \mathbb{R}^m$ is called the observation operator. Based on the knowledge of $y$, the task is then to recover a quantity of interest $Q(f)$, where $Q$ is assumed throughout this article to be a linear functional. The recovery procedure can be viewed as a map $\Delta$ from $\mathbb{R}^m$ to $\mathbb{R}$, with no concern for its practicability at this point.
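As a toy illustration of this setup (not part of the article's MATLAB file; the polynomial $f$ and the points $x^{(i)}$ are invented), the snippet below observes an unknown object through point evaluations and collects them into the data vector via a linear observation operator:

```python
import numpy as np

# Minimal sketch of the observation setup, with all concrete choices invented:
# the unknown f is observed through m point evaluations lambda_i(f) = f(x_i),
# and the observation operator Lam collects them into the data vector y in R^m.
f = np.polynomial.Polynomial([1.0, -0.5, 2.0])   # stand-in for the unknown object f
x = np.linspace(-1.0, 1.0, 5)                    # m = 5 distinct evaluation points

def Lam(g):
    """Observation operator: maps g to its vector of point values (g(x_1), ..., g(x_m))."""
    return np.array([g(s) for s in x])

y = Lam(f)                                       # the data vector y = Lam(f)
# Lam is linear: observing a*g + h yields a*Lam(g) + Lam(h)
```

The recovery procedure is then any map from the data vector `y` back to an estimate of the quantity of interest $Q(f)$.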
Besides the observational data (also called a posteriori information), there is some a priori information coming from an educated belief about the properties of realistic $f$'s. It translates into the assumption that $f$ belongs to a model set $\mathcal{K} \subseteq F$. The choice of this model set is of course critical. When the $f$'s indeed represent functions, it is traditionally taken as the unit ball with respect to some norm that characterizes smoothness. More recently, motivated by parametric partial differential equations, a model based on approximation capabilities has been proposed in [2]. Namely, given a linear subspace $V$ of $F$ and a threshold $\varepsilon > 0$, it is defined as
$$\mathcal{K} = \{ f \in F : \operatorname{dist}(f, V) \le \varepsilon \}. \tag{2}$$
This model set is also implicit in many numerical procedures and in machine learning.
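To make the approximability model concrete, the sketch below checks membership in a set of the form $\{f : \operatorname{dist}(f, V) \le \varepsilon\}$, with the distance measured in a discretized $L_2$ norm; the subspace, the function, and the threshold are all invented for this illustration:

```python
import numpy as np

# Illustrative membership check for the approximability set
# K = { f : dist(f, V) <= eps }, with dist measured in a discretized L2 norm
# on [-1, 1].  The subspace V, the function f, and eps are invented.
t = np.linspace(-1.0, 1.0, 400)
V = np.vstack([np.ones_like(t), t, t**2]).T        # V = span{1, t, t^2}, dim V = 3

f_vals = np.cos(t)                                 # stand-in for the unknown f
coef, *_ = np.linalg.lstsq(V, f_vals, rcond=None)  # best approximation of f from V
dist = np.sqrt(np.mean((f_vals - V @ coef) ** 2))  # dist(f, V), discretized L2

eps = 0.01
in_model_set = dist <= eps                         # f is well approximated by V
```

Here $\cos$ lies in the model set because a quadratic already approximates it to within the (invented) threshold on $[-1,1]$.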
Whatever the selected model set, the performance of a recovery procedure $\Delta : \mathbb{R}^m \to \mathbb{R}$ is measured in a worst-case setting via the (global) error of $\Delta$ over $\mathcal{K}$, i.e.,
$$\operatorname{err}(\mathcal{K}, \Delta) = \sup_{f \in \mathcal{K}} |Q(f) - \Delta(\Lambda f)|. \tag{4}$$
Obviously, one is interested in optimal recovery maps $\Delta^{\mathrm{opt}}$ minimizing this worst-case error, i.e., such that
$$\operatorname{err}(\mathcal{K}, \Delta^{\mathrm{opt}}) = \inf_{\Delta} \operatorname{err}(\mathcal{K}, \Delta).$$
This infimum is called the intrinsic error of the observation map $\Lambda$ (for $Q$ over $\mathcal{K}$). It is known, at least since Smolyak's doctoral dissertation [13], that there is a linear functional among the optimal recovery maps as soon as the set $\mathcal{K}$ is symmetric and convex; see e.g. [10, Theorem 4.7] for a proof. The practicality of such a linear optimal recovery map is not automatic, though. For the approximability set (2), Theorem 3.1 of [4] revealed that such a linear optimal recovery map takes the form $\Delta^{\mathrm{opt}}(y) = \sum_{i=1}^m a_i y_i$, where $a \in \mathbb{R}^m$ is a solution to an optimization problem that can be solved in exact form when the observation functionals are point evaluations (see [4]) and in approximate form when they are arbitrary linear functionals (see [5] or Section 3.2).
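To give a flavor of how such coefficients $a$ arise, here is a generic Hilbert-space sketch, not the exact optimization problem of [4]: writing $Q(f) = \langle u, f\rangle$ and $\lambda_i(f) = \langle v_i, f\rangle$, a linear map with coefficients $a$ has worst-case error $\varepsilon \| u - \sum_i a_i v_i \|$ over the approximability set provided the residual is orthogonal to $V$ (and infinite error otherwise), so an optimal $a$ solves an equality-constrained least-squares problem. All concrete choices below (the grid, $Q$, the observation points) are invented:

```python
import numpy as np

# Equality-constrained least squares for the coefficients a of Delta(y) = <a, y>,
# in a discretized L2 space on [-1, 1]: minimize ||u - W a|| subject to
# B^T (u - W a) = 0, solved via the KKT system.  (Illustrative stand-in only.)
N = 201
t, dt = np.linspace(-1.0, 1.0, N, retstep=True)

B = np.vstack([np.ones(N), t]).T                  # basis of V = span{1, t}
u = np.ones(N)                                    # Q(f) = <u, f>, a discretized integral
idx = np.array([10, 70, 140, 180])                # grid indices of the m = 4 points x_i
W = np.zeros((N, 4))
W[idx, np.arange(4)] = 1.0 / dt                   # lambda_i(f) = f(x_i), as <v_i, f>

G = W.T @ W * dt                                  # Gram matrix of the v_i
C = B.T @ W * dt                                  # constraint: C a = B^T u * dt
kkt = np.block([[G, C.T], [C, np.zeros((2, 2))]])
rhs = np.concatenate([W.T @ u * dt, B.T @ u * dt])
a = np.linalg.solve(kkt, rhs)[:4]                 # coefficients of the linear map

# Sanity check: the map reproduces Q exactly on V, here for v = 2 + 3t
v = 2.0 + 3.0 * t
assert np.isclose(a @ v[idx], (u * v).sum() * dt)
```

The orthogonality constraint forces the linear map to be exact on $V$; the remaining least-squares freedom then minimizes the worst-case error over the rest of the model set.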
The approximability set (2), however, presents some important restrictions. Suppose indeed that there is some nonzero $h \in V \cap \ker \Lambda$. Then, for a given $f \in \mathcal{K}$ observed through $y = \Lambda f$, any $f + t h$, $t \in \mathbb{R}$, is both model-consistent (i.e., $f + t h \in \mathcal{K}$) and data-consistent (i.e., $\Lambda(f + t h) = y$), so that the local error at $y$ of any recovery map $\Delta$ satisfies
$$\sup_{\substack{f' \in \mathcal{K} \\ \Lambda f' = y}} |Q(f') - \Delta(y)| \;\ge\; \sup_{t \in \mathbb{R}} |Q(f) + t\, Q(h) - \Delta(y)|,$$
which is infinite whenever $Q(h) \neq 0$. Thus, for the optimal recovery problem to make sense under the approximability model (2), independently of the quantity of interest $Q$, one must assume that $V \cap \ker \Lambda = \{0\}$. By a dimension argument, this imposes $n := \dim V \le m$. In other words, we must place ourselves in an underparametrized regime for which the number of parameters describing the model does not exceed the number of data. This contrasts with many current studies, especially in the field of Deep Learning, which emphasize the advantages of overparametrization. In order to incorporate overparametrization in the optimal recovery problem under consideration, we must then restrict the magnitude of model- and data-consistent elements. An obvious strategy consists in altering the approximability set (2). We do so in two different ways, namely by considering bounded approximability sets of a first type and of a second type. We will start by analyzing the second type of bounded approximability sets in Section 2, formally describing the optimal recovery maps before revealing on a familiar example how the associated minimization problem is tackled in practice. The main ingredient belongs in essence to the sum-of-squares techniques from semidefinite programming. Next, we will analyze the first type of bounded approximability sets in Section 3. There, we will in fact describe optimal recovery maps over more general model sets consisting of intersections of approximability sets. On the prior example, we will again reveal how the associated minimization problem is tackled in practice. This time, the main ingredient belongs in essence to the moment techniques from semidefinite programming.
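The dimension argument can be seen numerically. The sketch below (with an invented polynomial subspace and invented observation points) exhibits, in an overparametrized situation with $\dim V = 7 > m = 4$, a nonzero $h$ in the intersection of $V$ and $\ker \Lambda$, so that $f$ and $f + t h$ generate identical data for every $t$:

```python
import numpy as np

# Overparametrized situation: dim(V) = 7 polynomial coefficients but only
# m = 4 point evaluations, so V intersects ker(Lambda) nontrivially.
# The subspace and the points are invented for illustration.
x = np.array([-0.8, -0.2, 0.3, 0.9])              # m = 4 observation points
Lam_V = np.vander(x, 7, increasing=True)          # Lambda restricted to V (4 x 7)

# Extract a nonzero h in the intersection of V and ker(Lambda) from the SVD
# (the nullspace has dimension >= 7 - 4 = 3)
h = np.linalg.svd(Lam_V)[2][-1]                   # coefficient vector of h

f = np.zeros(7); f[0] = 1.0                       # say f = 1, an element of the model set
assert np.allclose(Lam_V @ f, Lam_V @ (f + 50 * h))   # identical data y
# Since f + t*h is also model-consistent, any Q with Q(h) != 0 suffers an
# infinite local error at y, whence the requirement dim V <= m.
```

Bounding the magnitude of such invisible perturbations, rather than forbidding them outright, is exactly what the refined model sets of Sections 2 and 3 achieve.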
In view of this article's emphasis on computability issues, all of the theoretical constructions are illustrated in a reproducible MATLAB file downloadable from the author's webpage.
Before delving into technicalities, we stress that this article considers a scenario where the observation map $\Lambda$ is fixed, meaning that the user is not free to select favorable observation functionals $\lambda_1, \ldots, \lambda_m$. As such, we are not concerned with the minimal number of observations needed to achieve a prescribed accuracy. In other words, we do not investigate the complexity of the problem. But replacing the approximability set (2) by one of its bounded refinements is actually akin to a strategy which is popular in complexity studies: substituting a refined model set for a model set that was initially too rich. This strategy can transform an intractable multivariate problem into a tractable one; see [12] for a typical example. For our refined approximability sets, complexity inquiries would be interesting to pursue, especially in order to uncover more qualitative statements on the benefits (or lack thereof) of overparametrization. This would require, for a start, precisely estimating the worst-case error (4) minimized over $\Delta$, and in particular its dependence on the number of variables when the elements of $F$ are multivariate functions. However, this is beyond the scope of this article, whose focus is placed on the computability of the optimal recovery maps.
Bounded approximability set of the second type
We concentrate in this section on the bounded approximability set of the second type. We shall first describe optimal recovery maps before showing how they can be computed in practice.
Bounded approximability set of the first type
We concentrate in this section on the bounded approximability set of the first type. Once again, we shall first describe optimal recovery maps before showing how they can be computed in practice.
References (13)
- I.H. Sloan, H. Woźniakowski, When are quasi-Monte Carlo algorithms efficient for high dimensional integrals?, J. Complexity (1998)
- A. Ben-Tal, L. El Ghaoui, A. Nemirovski, Robust Optimization (2009)
- P. Binev, A. Cohen, W. Dahmen, R. DeVore, G. Petrova, P. Wojtaszczyk, Data assimilation in reduced modeling, SIAM/ASA J. Uncertain. Quantif. (2017)
- S. Boyd, L. Vandenberghe, Convex Optimization (2004)
- R. DeVore, S. Foucart, G. Petrova, P. Wojtaszczyk, Computing a quantity of interest from observational data, Constr. Approx. (2019)
- M. Ettehad, S. Foucart, Instances of computational optimal recovery: dealing with observation errors, SIAM/ASA J. Uncertain. Quantif. (2021)
- ☆ Communicated by E. Novak.
- 1 S.F. is partially supported by the NSF grants DMS-1622134 and DMS-1664803, and also acknowledges the NSF grant CCF-1934904.