
Journal of Complexity

Volume 62, February 2021, 101503

Instances of computational optimal recovery: Refined approximability models

https://doi.org/10.1016/j.jco.2020.101503

Abstract

Models based on approximation capabilities have recently been studied in the context of Optimal Recovery. These models, however, are not compatible with overparametrization, since model- and data-consistent functions could then be unbounded. This drawback motivates the introduction of refined approximability models featuring an added boundedness condition. Thus, two new models are proposed in this article: one where the boundedness applies to the target functions (first type) and one where the boundedness applies to the approximants (second type). For both types of models, optimal maps for the recovery of linear functionals are first described on an abstract level before their efficient constructions are addressed. By exploiting techniques from semidefinite programming, these constructions are explicitly carried out on a common example involving polynomial subspaces of $\mathcal{C}[-1,1]$.

Introduction

The objective of this article is to uncover practical methods for the optimal recovery of functions available through observational data when the underlying models based on approximability allow for overparametrization. To clarify this objective and its various challenges, we start with some background on traditional Optimal Recovery. Typically, an unknown function $f$ defined on a domain $\mathcal{D}$ is observed through point evaluations $y_1 = f(x_1), \ldots, y_m = f(x_m)$ at distinct points $x_1, \ldots, x_m \in \mathcal{D}$. More generally, an unknown object $f$, simply considered as an element of a normed space $X$, is observed through
$$y_i = \ell_i(f), \qquad i \in [1:m],$$
where $\ell_1, \ldots, \ell_m$ are linear functionals defined on $X$. We assume here that these data are perfectly accurate; we refer to the companion article [5] for the incorporation of observation error. The data are summarized as $y = L(f)$, where the linear map $L: X \to \mathbb{R}^m$ is called the observation operator. Based on the knowledge of $y \in \mathbb{R}^m$, the task is then to recover a quantity of interest $Q(f)$, where throughout this article $Q: X \to \mathbb{R}$ is assumed to be a linear functional. The recovery procedure can be viewed as a map $R$ from $\mathbb{R}^m$ to $\mathbb{R}$, with no concern for its practicability at this point.

Besides the observational data (which is also called a posteriori information), there is some a priori information coming from an educated belief about the properties of realistic $f$'s. It translates into the assumption that $f$ belongs to a model set $\mathcal{K} \subseteq X$. The choice of this model set is of course critical. When the $f$'s indeed represent functions, it is traditionally taken as the unit ball with respect to some norm that characterizes smoothness. More recently, motivated by parametric partial differential equations, a model based on approximation capabilities has been proposed in [2]. Namely, given a linear subspace $V$ of $X$ and a threshold $\varepsilon > 0$, it is defined as
$$\mathcal{K} = \mathcal{K}_{V,\varepsilon} := \{ f \in X : \operatorname{dist}_X(f, V) \le \varepsilon \}. \tag{2}$$
This model set is also implicit in many numerical procedures and in machine learning.

Whatever the selected model set, the performance of the recovery procedure $R: \mathbb{R}^m \to \mathbb{R}$ is measured in a worst-case setting via the (global) error of $R$ over $\mathcal{K}$, i.e.,
$$e_{\mathcal{K},Q}(L,R) := \sup_{f \in \mathcal{K}} |Q(f) - R(L(f))|.$$
Obviously, one is interested in optimal recovery maps $R^{\mathrm{opt}}: \mathbb{R}^m \to \mathbb{R}$ minimizing this worst-case error, i.e., such that
$$e_{\mathcal{K},Q}(L,R^{\mathrm{opt}}) = \inf_{R: \mathbb{R}^m \to \mathbb{R}} e_{\mathcal{K},Q}(L,R).$$
This infimum is called the intrinsic error of the observation map $L$ (for $Q$ over $\mathcal{K}$). It is known, at least since Smolyak's doctoral dissertation [13], that there is a linear functional among the optimal recovery maps as soon as the set $\mathcal{K}$ is symmetric and convex, see e.g. [10, Theorem 4.7] for a proof. The practicality of such a linear optimal recovery map is not automatic, though. For the approximability set (2), Theorem 3.1 of [4] revealed that such a linear optimal recovery map takes the form $R^{\mathrm{opt}}: y \in \mathbb{R}^m \mapsto \sum_{i=1}^m a_i^{\mathrm{opt}} y_i$, where $a^{\mathrm{opt}} \in \mathbb{R}^m$ is a solution to
$$\underset{a \in \mathbb{R}^m}{\mathrm{minimize}} \ \Big\| Q - \sum_{i=1}^m a_i \ell_i \Big\|_{X^*} \quad \text{subject to} \quad \sum_{i=1}^m a_i \ell_i(v) = Q(v) \ \text{ for all } v \in V,$$
an optimization problem that can be solved for $X = \mathcal{C}(\mathcal{D})$ in exact form when the observation functionals are point evaluations (see [4]) and in approximate form when they are arbitrary linear functionals (see [5] or Section 3.2).
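To make this minimization concrete, here is a minimal sketch (not the author's MATLAB file) of how it could be set up numerically. It assumes a crude discretization in which $X = \mathcal{C}([-1,1])$ is replaced by $\mathbb{R}^N$ equipped with the sup norm over a fine grid, so that dual norms of functionals become $\ell_1$ norms and the problem reduces to a linear program; the grid size, evaluation points, and quadrature weights are illustrative choices, handled with cvxpy.

```python
import numpy as np
import cvxpy as cp

# Crude discretization: the sup norm on a fine grid of [-1,1] stands in for
# the norm of C([-1,1]), so dual norms of functionals become l1 norms.
N, m, n = 200, 8, 5                         # grid size, number of data, dim(V)
x = np.linspace(-1.0, 1.0, N)
Vbasis = np.vander(x, n, increasing=True)   # columns span V = polynomials of degree < n
idx = np.linspace(10, N - 11, m).astype(int)
U = np.zeros((N, m))
U[idx, np.arange(m)] = 1.0                  # point-evaluation functionals ell_i
q = np.full(N, 2.0 / N)                     # Q(f) ~ integral of f over [-1,1] (crude quadrature)

# Minimize the dual norm of Q - sum_i a_i ell_i subject to the constraint
# that sum_i a_i ell_i agrees with Q on the subspace V.
a = cp.Variable(m)
residual = q - U @ a
problem = cp.Problem(cp.Minimize(cp.norm1(residual)),
                     [Vbasis.T @ residual == 0])
problem.solve()
a_opt = a.value                             # weights of a (discretized) linear optimal recovery map
```

The recovered quantity is then $\sum_{i=1}^m a_i^{\mathrm{opt}} y_i$; in the continuous setting treated in the article, the same problem is handled exactly or approximately as indicated above.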

The approximability set (2), however, presents some important restrictions. Suppose indeed that there is some nonzero $v \in \ker(L) \cap V$. Then, for a given $f_0 \in \mathcal{K}$ observed through $y = L(f_0) \in \mathbb{R}^m$, any $f_t := f_0 + t v$, $t \in \mathbb{R}$, is both model-consistent (i.e., $f_t \in \mathcal{K}$) and data-consistent (i.e., $L(f_t) = y$), so that the local error at $y$ of any recovery map $R: \mathbb{R}^m \to \mathbb{R}$ satisfies
$$e^{\mathrm{loc}}_{\mathcal{K},Q}(L, R(y)) := \sup_{\substack{f \in \mathcal{K} \\ L(f) = y}} |Q(f) - R(y)| \ge \sup_{t \in \mathbb{R}} |Q(f_t) - R(y)| = \sup_{t \in \mathbb{R}} |(Q(f_0) - R(y)) + t\, Q(v)|,$$
which is infinite whenever $Q(v) \ne 0$. Thus, for the optimal recovery problem to make sense under the approximability model (2), independently of the quantity of interest $Q$, one must assume that $\ker(L) \cap V = \{0\}$. By a dimension argument, this imposes
$$n := \dim(V) \le m.$$
In other words, we must place ourselves in an underparametrized regime for which the number $n$ of parameters describing the model does not exceed the number $m$ of data. This contrasts with many current studies, especially in the field of Deep Learning, which emphasize the advantages of overparametrization. In order to incorporate overparametrization in the optimal recovery problem under consideration, we must then restrict the magnitude of model- and data-consistent elements. A natural strategy consists in altering the approximability set (2). We do so in two different ways, namely by considering a bounded approximability set of the first type, i.e.,
$$\mathcal{K} = \mathcal{K}^{\mathrm{I}}_{V,\varepsilon,\kappa} := \{ f \in X : \operatorname{dist}_X(f, V) \le \varepsilon \text{ and } \|f\|_X \le \kappa \},$$
and a bounded approximability set of the second type, i.e.,
$$\mathcal{K} = \mathcal{K}^{\mathrm{II}}_{V,\varepsilon,\kappa} := \{ f \in X : \exists\, v \in V \text{ with } \|f - v\|_X \le \varepsilon \text{ and } \|v\|_X \le \kappa \}.$$

We will start by analyzing the second type of bounded approximability sets in Section 2, formally describing the optimal recovery maps before revealing on a familiar example how the associated minimization problem is tackled in practice. The main ingredient belongs in essence to the sum-of-squares techniques from semidefinite programming. Next, we will analyze the first type of bounded approximability sets in Section 3. We will even formally describe optimal recovery maps over more general model sets consisting of intersections of approximability sets. On the prior example, we will again reveal how the associated minimization problem is tackled in practice. This time, the main ingredient belongs in essence to the moment techniques from semidefinite programming. In view of this article's emphasis on computability issues, all of the theoretical constructions are illustrated in a reproducible MATLAB file downloadable from the author's webpage.
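The unboundedness phenomenon described above is easy to witness numerically. The following sketch (same illustrative discretized setting as before, with hypothetical sizes) takes $\dim(V) > m$, extracts a nonzero $v \in \ker(L) \cap V$, and shows that the perturbations $f_t = f_0 + t v$, which leave both the data and the distance to $V$ unchanged, make $Q(f_t)$ arbitrarily large whenever $Q(v) \ne 0$.

```python
import numpy as np
from scipy.linalg import null_space

# Overparametrized regime: n = dim(V) exceeds the number m of data, so
# ker(L) and V intersect nontrivially (illustrative discretized setting).
N, m, n = 200, 4, 6
x = np.linspace(-1.0, 1.0, N)
Vbasis = np.vander(x, n, increasing=True)             # V = polynomials of degree < n
idx = np.linspace(10, N - 11, m).astype(int)
L = np.zeros((m, N))
L[np.arange(m), idx] = 1.0                            # m point evaluations
q = np.full(N, 2.0 / N)                               # Q(f) ~ integral of f

# nonzero v in ker(L) ∩ V: coefficients c with L @ (Vbasis @ c) = 0
c = null_space(L @ Vbasis)[:, 0]                      # exists since n > m
v = Vbasis @ c

f0 = np.sin(np.pi * x)                                # some f0, observed through y = L @ f0
for t in [0.0, 1e2, 1e4]:
    ft = f0 + t * v                                   # L @ ft = L @ f0 and dist(ft, V) = dist(f0, V)
    print(t, q @ ft)                                  # Q(ft) blows up when Q(v) != 0
```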

Before delving into technicalities, we stress that this article considers a scenario where $L$ is fixed, meaning that the user is not free to select favorable observation functionals $\ell_1, \ldots, \ell_m$. As such, we are not concerned with the minimal number $m$ of observations needed to achieve a prescribed accuracy. In other words, we do not investigate the complexity of the problem. But replacing $\mathcal{K}_{V,\varepsilon}$ by $\mathcal{K}^{\mathrm{I}}_{V,\varepsilon,\kappa}$ or $\mathcal{K}^{\mathrm{II}}_{V,\varepsilon,\kappa}$ is actually akin to a strategy that is popular in complexity studies: substituting a refined model set for a model set that was initially too rich. This strategy can transform an intractable multivariate problem into a tractable one, see [12] for a typical example. For our refined approximability sets, complexity inquiries would be interesting to pursue, especially in order to uncover more qualitative statements on the benefits (or lack thereof) of overparametrization. This would require, for a start, precisely estimating the worst-case error (4) minimized over $L$, and in particular its dependence on the number of variables of multivariate functions $f \in X$. However, this is beyond the scope of this article, whose focus is placed on the computability of the optimal recovery maps.


Bounded approximability set of the second type

We concentrate in this section on the bounded approximability set of the second type, i.e., on
$$\mathcal{K} = \{ f \in X : \exists\, v \in V \text{ with } \|f - v\|_X \le \varepsilon \text{ and } \|v\|_X \le \kappa \}.$$
We shall first describe optimal recovery maps before showing how they can be computed in practice.
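The computation of optimal recovery maps over this set, via sum-of-squares techniques, is carried out in the full text. As a much smaller illustration of the definition itself, the sketch below (the discretized setting and parameter values are hypothetical) tests membership in this model set by searching over $v \in V$ with cvxpy.

```python
import numpy as np
import cvxpy as cp

# Membership test for the second-type set: is there v in V with
# ||f - v|| <= eps and ||v|| <= kappa?  The sup norm over a grid of
# [-1,1] stands in for the norm of C([-1,1]).
N, n = 200, 5
x = np.linspace(-1.0, 1.0, N)
Vbasis = np.vander(x, n, increasing=True)   # V = polynomials of degree < n
eps, kappa = 0.2, 2.0
f = np.sin(np.pi * x)                       # candidate element of X

c = cp.Variable(n)                          # coefficients of v = Vbasis @ c
v = Vbasis @ c
best = cp.Problem(cp.Minimize(cp.norm_inf(f - v)),
                  [cp.norm_inf(v) <= kappa]).solve()
print(best <= eps)                          # f belongs to the set iff an admissible v is eps-close
```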

Bounded approximability set of the first type

We concentrate in this section on the bounded approximability set of the first type, i.e., on
$$\mathcal{K} = \{ f \in X : \operatorname{dist}_X(f, V) \le \varepsilon \text{ and } \|f\|_X \le \kappa \}.$$
Once again, we shall first describe optimal recovery maps before showing how they can be computed in practice.
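Again, the actual computation of optimal recovery maps, via moment techniques, is in the full text; as an illustration of the definition only, membership in this first-type set splits into two independent checks in the discretized setting: a best-approximation problem for $\operatorname{dist}_X(f,V)$ and a direct norm bound on $f$ itself (parameter values below are hypothetical).

```python
import numpy as np
import cvxpy as cp

# Membership test for the first-type set: dist(f, V) <= eps and ||f|| <= kappa,
# the bound now applying to f itself rather than to its approximant from V.
N, n = 200, 5
x = np.linspace(-1.0, 1.0, N)
Vbasis = np.vander(x, n, increasing=True)
eps, kappa = 0.2, 2.0
f = np.sin(np.pi * x)

c = cp.Variable(n)
dist_fV = cp.Problem(cp.Minimize(cp.norm_inf(f - Vbasis @ c))).solve()
print(dist_fV <= eps and np.max(np.abs(f)) <= kappa)
```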

References (13)

  • I.H. Sloan et al., When are quasi-Monte Carlo algorithms efficient for high dimensional integrals?, J. Complexity (1998)
  • A. Ben-Tal et al., Robust Optimization (2009)
  • P. Binev et al., Data assimilation in reduced modeling, SIAM/ASA J. Uncertain. Quantif. (2017)
  • S. Boyd et al., Convex Optimization (2004)
  • R. DeVore et al., Computing a quantity of interest from observational data, Constr. Approx. (2019)
  • M. Ettehad, S. Foucart, Instances of computational optimal recovery: dealing with observation errors ...


Communicated by E. Novak.


S.F. is partially supported by the NSF (United States of America) grants DMS-1622134 and DMS-1664803, and also acknowledges the NSF grant CCF-1934904.
