Abstract
A methodology for using random sketching in the context of model order reduction for high-dimensional parameter-dependent systems of equations was introduced in Balabanov and Nouy (Part I; Advances in Computational Mathematics 45:2969–3019, 2019). Following this framework, we here construct a reduced model from a small, efficiently computable random object called a sketch of a reduced model, using minimal residual methods. We introduce a sketched version of the minimal residual based projection as well as a novel nonlinear approximation method, where for each parameter value, the solution is approximated by minimal residual projection onto a subspace spanned by several vectors picked (online) from a dictionary of candidate basis vectors. It is shown that the random sketching technique can improve not only efficiency but also numerical stability. A rigorous analysis of the conditions on the random sketch required to obtain a given accuracy is presented. These conditions may be ensured a priori with high probability by considering for the sketching matrix an oblivious embedding of sufficiently large size. Furthermore, a simple and reliable procedure for a posteriori verification of the quality of the sketch is provided. This approach can be used for certification of the approximation as well as for adaptive selection of the size of the random sketching matrix.
References
Amsallem, D., Farhat, C., Zahr, M.: On the robustness of residual minimization for constructing POD-based reduced-order CFD models. In: 21st AIAA Computational Fluid Dynamics Conference (2013)
Amsallem, D., Haasdonk, B.: PEBL-ROM: Projection-error based local reduced-order models. Advanced Modeling and Simulation in Engineering Sciences 3(1), 6 (2016)
Balabanov, O., Nouy, A.: Randomized linear algebra for model reduction. Part I: Galerkin methods and error estimation. Advances in Computational Mathematics 45, 2969–3019 (2019)
Barrault, M., Maday, Y., Nguyen, N.C., Patera, A.T.: An empirical interpolation method: application to efficient reduced-basis discretization of partial differential equations. Comptes Rendus Mathématique 339(9), 667–672 (2004)
Benner, P., Cohen, A., Ohlberger, M., Willcox, K. (eds.): Model reduction and approximation: theory and algorithms. SIAM, Philadelphia (2017)
Bertoldi, K., Vitelli, V., Christensen, J., van Hecke, M.: Flexible mechanical metamaterials. Nature Reviews Materials 2(11), 17066 (2017)
Bistrian, D.A., Navon, I.M.: Randomized dynamic mode decomposition for nonintrusive reduced order modelling. Int. J. Numer. Methods Eng. 112 (1), 3–25 (2017)
Brunton, S.L., Proctor, J.L., Tu, J.H., Kutz, J.N.: Compressed sensing and dynamic mode decomposition. Journal of Computational Dynamics 2, 165 (2015)
Buhr, A., Smetana, K.: Randomized local model order reduction. SIAM J. Sci. Comput. 40, 2120–2151 (2018)
Bui-Thanh, T., Willcox, K., Ghattas, O.: Model reduction for large-scale systems with high-dimensional parametric input space. SIAM J. Sci. Comput. 30(6), 3270–3288 (2008)
Cao, Y., Petzold, L.: A posteriori error estimation and global error control for ordinary differential equations by the adjoint method. SIAM J. Sci. Comput. 26(2), 359–374 (2004)
Carlberg, K., Farhat, C., Cortial, J., Amsallem, D.: The GNAT method for nonlinear model reduction: effective implementation and application to computational fluid dynamics and turbulent flows. J. Comput. Phys. 242, 623–647 (2013)
Chen, H., Chan, C.T.: Acoustic cloaking and transformation acoustics. J. Phys. D: App. Phys. 43(11), 113001 (2010)
Chen, Y., Hesthaven, J.S., Maday, Y., Rodríguez, J.: A monotonic evaluation of lower bounds for inf-sup stability constants in the frame of reduced basis approximations. Comptes Rendus Mathematique 346(23), 1295–1300 (2008)
Chen, Y., Hesthaven, J.S., Maday, Y., Rodríguez, J.: Improved successive constraint method based a posteriori error estimate for reduced basis approximation of 2D Maxwell’s problem. ESAIM: Mathematical Modelling and Numerical Analysis 43(6), 1099–1116 (2009)
Cheng, Y., Yang, F., Xu, J.Y., Liu, X.J.: A multilayer structured acoustic cloak with homogeneous isotropic materials. Applied Physics Letters 92 (15), 151913 (2008)
DeVore, R.A.: Nonlinear approximation and its applications. In: Multiscale, Nonlinear and Adaptive Approximation, pages 169–201. Springer (2009)
Dihlmann, M., Kaulmann, S., Haasdonk, B.: Online reduced basis construction procedure for model reduction of parametrized evolution systems. IFAC Proceedings Volumes 45(2), 112–117 (2012)
Eftang, J.L., Knezevic, D.J., Patera, A.T.: An hp certified reduced basis method for parametrized parabolic partial differential equations. Mathematical and Computer Modelling of Dynamical Systems 17(4), 395–422 (2011)
Eftang, J.L., Patera, A.T., Rønquist, E.M.: An “hp” certified reduced basis method for parametrized elliptic partial differential equations. SIAM Journal on Scientific Computing 32(6), 3170–3200 (2010)
Erichson, N.B., Donovan, C.: Randomized low-rank dynamic mode decomposition for motion detection. Comput. Vis. Image Underst. 146, 40–50 (2016)
Erichson, N.B., Mathelin, L., Brunton, S.L., Kutz, J.N.: Randomized dynamic mode decomposition. SIAM Journal on Applied Dynamical Systems 18(4), 1867–1891 (2019)
Haasdonk, B.: Reduced basis methods for parametrized PDEs – A tutorial introduction for stationary and instationary problems. Model reduction and approximation: theory and algorithms 15, 65 (2017)
Halko, N., Martinsson, P.-G., Tropp, J.A.: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM review 53(2), 217–288 (2011)
Hochman, A., Villena, J.F., Polimeridis, A.G., Silveira, L.M., White, J.K., Daniel, L.: Reduced-order models for electromagnetic scattering problems. IEEE Transactions on Antennas and Propagation 62(6), 3150–3162 (2014)
Homescu, C., Petzold, L.R., Serban, R.: Error estimation for reduced-order models of dynamical systems. SIAM J. Numer. Anal. 43(4), 1693–1714 (2005)
Huynh, D.B.P., Rozza, G., Sen, S., Patera, A.T.: A successive constraint linear optimization method for lower bounds of parametric coercivity and inf–sup stability constants. Comptes Rendus Mathématique 345(8), 473–478 (2007)
Huynh, D.B.P., Knezevic, D.J., Chen, Y., Hesthaven, J.S., Patera, A.T.: A natural-norm Successive Constraint Method for inf-sup lower bounds. Computer Methods in Applied Mechanics and Engineering 199(29), 1963–1975 (2010)
Kadic, M., Bückmann, T., Schittny, R., Gumbsch, P., Wegener, M.: Pentamode metamaterials with independently tailored bulk modulus and mass density. Physical Review Applied 2(5), 054007 (2014)
Kaulmann, S., Haasdonk, B.: Online greedy reduced basis construction using dictionaries. In: VI International Conference on Adaptive Modeling and Simulation (ADMOS 2013), pp. 365–376 (2013)
Kramer, B., Grover, P., Boufounos, P., Nabi, S., Benosman, M.: Sparse sensing and dmd-based identification of flow regimes and bifurcations in complex flows. SIAM Journal on Applied Dynamical Systems 16(2), 1164–1196 (2017)
Le Magoarou, L., Gribonval, R.: Flexible multilayer sparse approximations of matrices and applications. IEEE Journal of Selected Topics in Signal Processing 10(4), 688–700 (2016)
Maday, Y., Nguyen, N.C., Patera, A.T., Pau, S.H.: A general multipurpose interpolation procedure: the magic points. Communications on Pure & Applied Analysis 8(1), 383 (2009)
Maday, Y., Stamm, B.: Locally adaptive greedy approximations for anisotropic parameter reduced basis spaces. SIAM J. Sci. Comput. 35(6), 2417–2441 (2013)
Mahoney, M.W.: Randomized algorithms for matrices and data. Foundations and Trends® in Machine Learning 3(2), 123–224 (2011)
Rokhlin, V., Tygert, M.: A fast randomized algorithm for overdetermined linear least-squares regression. Proc. Natl. Acad. Sci. 105(36), 13212–13217 (2008)
Rozza, G., Huynh, D.B.P., Patera, A. T.: Reduced basis approximation and a posteriori error estimation for affinely parametrized elliptic coercive partial differential equations. Archives of Computational Methods in Engineering 15(3), 1 (2007)
Rubinstein, R., Zibulevsky, M., Elad, M.: Double sparsity: Learning sparse dictionaries for sparse signal approximation. IEEE Transactions on signal processing 58(3), 1553–1564 (2009)
Smetana, K., Zahm, O., Patera, A.T.: Randomized residual-based error estimators for parametrized equations. SIAM journal on scientific computing 41(2), 900–926 (2019)
Taddei, T.: An offline/online procedure for dual norm calculations of parameterized functionals: empirical quadrature and empirical test spaces. Advances in Computational Mathematics 45(5-6), 2429–2462 (2019)
Temlyakov, V.N.: Nonlinear Kolmogorov widths. Mathematical Notes 63(6), 785–795 (1998)
Tropp, J.A., Gilbert, A.C.: Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on information theory 53(12), 4655–4666 (2007)
Woodruff, D.P.: Sketching as a tool for numerical linear algebra. Foundations and Trends® in Theoretical Computer Science 10(1–2), 1–157 (2014)
Zahm, O., Nouy, A.: Interpolation of inverse operators for preconditioning parameter-dependent equations. SIAM J. Sci. Comput. 38 (2), 1044–1074 (2016)
Additional information
Communicated by: Gunnar J Martinsson
Appendices
Appendix 1. Post-processing the reduced model’s solution
In the paper we presented a methodology for the efficient computation of an approximate solution ur(μ), or, more precisely, of its coordinates in a certain basis, which can be the classical reduced basis for a fixed approximation space, or the dictionary vectors for the dictionary-based approximation presented in Section ??. The approximate solution ur(μ), however, is usually not what one should consider as the output. In fact, the amount of allowed online computations is highly limited and should be independent of the dimension of the full order model. Therefore, outputting \(\mathcal {O}(n)\) bytes of data as ur(μ) should be avoided when u(μ) is not the quantity of interest.
In the following, we consider an approximation with a single subspace Ur, noting that the presented approach can also be used for post-processing the dictionary-based approximation from Section ?? (by taking Ur as the subspace spanned by the dictionary vectors). Let Ur be a matrix whose column vectors form a basis for Ur and let ar(μ) be the coordinates of ur(μ) in this basis. A general quantity of interest s(μ) := l(u(μ); μ) can be approximated by sr(μ) := l(ur(μ); μ). Further, let us assume a linear case where l(u(μ); μ) := 〈l(μ),u(μ)〉 with \(\mathbf {l}(\mu ) \in U^{\prime }\) being the extractor of the quantity of interest. Then
\(s_{r}(\mu ) = \langle \mathbf {l}(\mu ), \mathbf {U}_{r} \mathbf {a}_{r}(\mu ) \rangle = \mathbf {l}_{r}(\mu )^{\mathrm {H}} \mathbf {a}_{r}(\mu ),\)  (A1.1)
where \(\mathbf {l}_{r}(\mu ):=\mathbf {U}_{r}^{\mathrm {H}} \mathbf {l}(\mu )\).
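The offline/online split behind this formula can be sketched as follows; a minimal NumPy illustration (not part of the original article's computations) with random data standing in for the reduced basis and the extractor, and all dimensions hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 10_000, 30                      # full / reduced dimensions (hypothetical)

U_r = rng.standard_normal((n, r))      # reduced basis matrix (columns span U_r)
l = rng.standard_normal(n)             # extractor of the linear quantity of interest

# Offline: reduce the extractor once, l_r = U_r^H l (r entries, independent of n).
l_r = U_r.conj().T @ l

# Online: for each parameter value only the coordinates a_r(mu) are needed.
a_r = rng.standard_normal(r)
s_r = np.vdot(l_r, a_r)                # s_r(mu) = l_r(mu)^H a_r(mu)

# Consistency with the full-order evaluation <l(mu), U_r a_r(mu)>:
assert np.isclose(s_r, np.vdot(l, U_r @ a_r))
```

Since l_r has only r entries, the online evaluation of sr(μ) costs O(r) operations, independently of n.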
Remark A.1
In general, our approach can be used for estimating an inner product between arbitrary parameter-dependent vectors. The possible applications include efficient estimation of the primal-dual correction and an extension to quadratic quantities of interest. In particular, the estimation of the primal-dual correction can be obtained by replacing l(μ) by r(ur(μ); μ) and ur(μ) by vr(μ) in (A1.1), where vr(μ) ∈ U is a reduced basis (or dictionary-based) approximate solution to the adjoint problem. A quadratic output quantity of interest has the form l(ur(μ); μ) := 〈L(μ)ur(μ) + l(μ),ur(μ)〉, where \(\mathbf {L}(\mu ):U \to U^{\prime }\) and \(\mathbf {l}(\mu ) \in U^{\prime }\). Such l(ur(μ); μ) can be readily derived from (A1.1) by replacing l(μ) with L(μ)ur(μ) + l(μ).
The affine factors of lr(μ) should first be precomputed in the offline stage and then used for the online evaluation of lr(μ) for each parameter value with a computational cost independent of the dimension of the original problem. The offline computations required for evaluating the affine factors of lr(μ), however, can still be too expensive or even infeasible to perform. Such a scenario may arise when using a high-dimensional approximation space (or a dictionary), when the extractor l(μ) has many (possibly expensive to maintain) affine terms, or when working in an extreme computational environment, e.g., with data streamed or distributed among multiple workstations. In addition, evaluating lr(μ) from the affine expansion, as well as evaluating \(\mathbf {l}_{r}(\mu )^{\mathrm {H}} \mathbf {a}_{r}(\mu )\) itself, can be subject to round-off errors (especially when Ur is ill-conditioned and may not be orthogonalized). Below, we provide a (probabilistic) way of estimating sr(μ) with a reduced computational cost and better numerical stability. At its core is the idea from [3, Section 4.3], proposed as a workaround for the expensive offline computations in the evaluation of the primal-dual correction.
Remark A.2
The spaces U and \(U^{\prime }\) are equipped with inner products 〈⋅,⋅〉U and \(\langle \cdot , \cdot \rangle _{U^{\prime }}\) (defined by the matrix RU), which are used for controlling the accuracy of the approximate solution ur(μ). In general, RU is chosen according to both the operator A(μ) and the extractor l(μ) of the quantity of interest. The goal of this section, however, is only the estimation of the quantity of interest from the given ur(μ). Consequently, for many problems it can be more pertinent to use here a different RU than the one employed for obtaining and characterizing ur(μ). The choice of RU should be made according to l(μ) (independently of A(μ)). For instance, for discretized parametric PDEs, if l(μ) represents an integral of the solution field over the spatial domain, then it is natural to choose 〈⋅,⋅〉U corresponding to the L2 inner product. Moreover, 〈⋅,⋅〉U is required to be an inner product only on a certain subspace of interest, which means that RU may be a positive semi-definite matrix. This consideration can be particularly helpful when the quantity of interest depends only on the restriction of the solution field to a certain subdomain. In such a case, 〈⋅,⋅〉U can be chosen to correspond to an inner product between restrictions of functions to this subdomain. The extension of random sketching to the estimation of semi-inner products is straightforward (see Remark A.3).
Remark A.3
Let us outline the extension of the methodology to the case where 〈⋅,⋅〉U is not definite. Let us assume that 〈⋅,⋅〉U is an inner product on a subspace \(W \subseteq U\) of interest. Then it follows that \(W^{\prime }:=\{ \mathbf {R}_{U} \mathbf {x} : \mathbf {x} \in W \}\) can be equipped with \(\langle \cdot , \cdot \rangle _{U^{\prime }}:= \langle \cdot , \mathbf {R}_{U}^{\dagger } \cdot \rangle \), where \(\mathbf {R}_{U}^{\dagger }\) is a pseudo-inverse of RU. Such products 〈⋅,⋅〉U and \(\langle \cdot , \cdot \rangle _{U^{\prime }}\) can be approximated by
\(\langle \cdot , \cdot \rangle ^{\mathbf {\Theta }}_{U} := \langle \mathbf {\Theta } \cdot , \mathbf {\Theta } \cdot \rangle \quad \text{and} \quad \langle \cdot , \cdot \rangle ^{\mathbf {\Theta }}_{U^{\prime }} := \langle \mathbf {\Theta } \mathbf {R}_{U}^{\dagger } \cdot , \mathbf {\Theta } \mathbf {R}_{U}^{\dagger } \cdot \rangle .\)  (A1.2)
This can be useful for the estimation of a (semi-)inner product between parameter-dependent vectors (see Remark A.2).
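A minimal numerical sketch of such a semi-inner product (an illustration added here, not the paper's computation), assuming a toy setting where RU acts as the identity on the first m degrees of freedom (a stand-in for a subdomain) and vanishes elsewhere, so that \(\mathbf {R}_{U}^{\dagger } = \mathbf {R}_{U}\); Θ is taken as a scaled Gaussian matrix composed with the restriction Q, where Q^H Q = RU:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 200, 40, 2000                # full dim, subdomain dim, sketch size

# Semi-definite R_U: an L2-type product on the first m dofs only (toy subdomain).
Q = np.eye(m, n)                       # restriction matrix; Q^H Q = R_U
R_U = Q.T @ Q                          # 0/1 diagonal, so R_U^{dagger} = R_U here

# Sketching matrix: Gaussian embedding composed with the restriction Q.
Omega = rng.standard_normal((k, m)) / np.sqrt(k)
Theta = Omega @ Q

x, y = rng.standard_normal(n), rng.standard_normal(n)
exact = x @ R_U @ y                    # semi-inner product <x, y>_U
sketched = (Theta @ x) @ (Theta @ y)   # sketched product <x, y>_U^Theta

# The sketched product only sees the subdomain and is close to the exact one.
assert abs(exact - sketched) < 0.25 * np.linalg.norm(Q @ x) * np.linalg.norm(Q @ y)
```

The tolerance in the final check is a loose probabilistic bound for this sketch size, not a deterministic guarantee.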
1.1 A1.1 Approximation of the quantity of interest
An efficiently computable and accurate estimation of sr(μ) can be obtained in two phases. In the first phase, the manifold \({\mathscr{M}}_{r} := \{ \mathbf {u}_{r}(\mu ) : \mu \in \mathcal {P} \}\) is (accurately enough) approximated by a subspace Wp := span(Wp) ⊂ U, spanned by the columns of an efficient-to-multiply (i.e., sparse or low-dimensional) matrix Wp. This matrix can be selected a priori or obtained depending on \({\mathscr{M}}_{r}\). In Section A1.2 we provide some strategies for choosing or computing the columns of Wp. The appropriate strategy should be selected depending on the particular problem and computational architecture. The solution vector ur(μ) is then approximated by its orthogonal projection wp(μ) := Wpcp(μ) onto Wp. The coordinates cp(μ) can be obtained from ar(μ) by
\(\mathbf {c}_{p}(\mu ) = \mathbf {H}_{p} \mathbf {a}_{r}(\mu ),\)  (A1.3)
where \(\mathbf {H}_{p} := [\mathbf {W}^{\mathrm {H}}_{p} \mathbf {R}_{U} \mathbf {W}_{p}]^{-1} \mathbf {W}^{\mathrm {H}}_{p} \mathbf {R}_{U} \mathbf {U}_{r}\). Note that since Wp is efficient to multiply by, the matrix Hp can be efficiently precomputed in the offline stage. We arrive at the following estimation of sr(μ):
\(s_{r}(\mu ) \approx \langle \mathbf {l}(\mu ), \mathbf {w}_{p}(\mu ) \rangle = \mathbf {l}^{\star }_{r}(\mu )^{\mathrm {H}} \mathbf {a}_{r}(\mu ),\)  (A1.4)
where \(\mathbf {l}^{\star }_{r}(\mu )^{\mathrm {H}}:= \mathbf {l}(\mu )^{\mathrm {H}} \mathbf {W}_{p} \mathbf {H}_{p}\). Unlike lr(μ), the affine factors of \(\mathbf {l}^{\star }_{r}(\mu )\) can now be efficiently precomputed thanks to the structure of Wp.
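The first phase can be sketched as follows (an illustrative NumPy fragment with synthetic data; RU is taken as the identity and Wp as a dense random matrix standing in for an efficient-to-multiply basis):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, p = 5000, 40, 10

U_r = rng.standard_normal((n, r))      # reduced basis
W_p = rng.standard_normal((n, p))      # stand-in for an efficient-to-multiply basis
l = rng.standard_normal(n)             # extractor; R_U = I for simplicity

# Offline: H_p = (W_p^H R_U W_p)^{-1} W_p^H R_U U_r maps a_r(mu) to the
# coordinates c_p(mu) of the orthogonal projection of u_r(mu) onto W_p.
H_p = np.linalg.solve(W_p.T @ W_p, W_p.T @ U_r)      # p x r
l_star = l @ W_p @ H_p                               # l_r^star(mu)^H, length r

# Online: cheap first-phase estimate of s_r(mu).
a_r = rng.standard_normal(r)
s_first = l_star @ a_r

# Sanity check: equals <l, w_p(mu)> with w_p(mu) = W_p H_p a_r(mu).
c_p = H_p @ a_r
assert np.isclose(s_first, l @ (W_p @ c_p))
```

Because l only multiplies the structured matrix Wp, forming l_star is cheap even when reducing l through Ur directly would be too expensive.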
In the second phase of the algorithm, the precision of (A1.4) is improved with a sketched (random) correction associated with a U → ℓ2 subspace embedding Θ:
\(s^{\star }_{r}(\mu ) := \mathbf {l}^{\star }_{r}(\mu )^{\mathrm {H}} \mathbf {a}_{r}(\mu ) + \langle \mathbf {R}_{U}^{-1} \mathbf {l}(\mu ), \mathbf {u}_{r}(\mu ) - \mathbf {w}_{p}(\mu ) \rangle ^{\mathbf {\Theta }}_{U}.\)  (A1.5)
In practice, \(s^{\star }_{r}(\mu )\) can be efficiently evaluated using the following relation:
\(s^{\star }_{r}(\mu ) = \left (\mathbf {l}^{\star }_{r}(\mu ) + {}_{\Delta }\mathbf {l}^{\star }_{r}(\mu ) \right )^{\mathrm {H}} \mathbf {a}_{r}(\mu ),\)  (A1.6)
where the affine terms of \({}_{\Delta }\mathbf {l}^{\star }_{r}(\mu )^{\mathrm {H}} := \mathbf {l}^{\mathbf {\Theta }}(\mu )^{\mathrm {H}} (\mathbf {U}^{\mathbf {\Theta }}_{r} - \mathbf {W}^{\mathbf {\Theta }}_{p} \mathbf {H}_{p})\) can be precomputed from the Θ-sketch of Ur, the sketched matrix \(\mathbf {W}^{\mathbf {\Theta }}_{p}:=\mathbf {\Theta } \mathbf {W}_{p}\), and the matrix Hp at a negligible computational cost.
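A sketch of the two-phase estimate with a Gaussian embedding (an added illustration; again RU = I and synthetic data, and the tolerance in the final check is a loose probabilistic bound, not a guarantee):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, p, k = 5000, 40, 10, 500

U_r = rng.standard_normal((n, r))
W_p = rng.standard_normal((n, p))
l = rng.standard_normal(n)
a_r = rng.standard_normal(r)

H_p = np.linalg.solve(W_p.T @ W_p, W_p.T @ U_r)      # first-phase projection map

# Oblivious U -> l2 embedding (Gaussian; R_U = I, so Theta applies directly).
Theta = rng.standard_normal((k, n)) / np.sqrt(k)
U_r_T, W_p_T = Theta @ U_r, Theta @ W_p              # Theta-sketches
l_T = Theta @ l                                      # sketch of R_U^{-1} l(mu)

# Affine term of the sketched correction, precomputable at negligible cost.
dl_star = l_T @ (U_r_T - W_p_T @ H_p)                # length-r vector

s_first = (l @ W_p @ H_p) @ a_r                      # first-phase estimate
s_star = s_first + dl_star @ a_r                     # corrected estimate (A1.6)

# The correction is a sketched estimate of <l, u_r - w_p>.
d = U_r @ a_r - W_p @ (H_p @ a_r)
corr_exact, corr_sketch = l @ d, dl_star @ a_r
assert abs(corr_sketch - corr_exact) <= 0.3 * np.linalg.norm(l) * np.linalg.norm(d)
```

Note that dl_star is assembled entirely from k-dimensional sketches, so the online cost is independent of n.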
Proposition A.4
If Θ is an (ε,δ, 1) oblivious U → ℓ2 subspace embedding, then
\(|s_{r}(\mu ) - s^{\star }_{r}(\mu )| \leq \varepsilon \| \mathbf {l}(\mu ) \|_{U^{\prime }} \| \mathbf {u}_{r}(\mu ) - \mathbf {w}_{p}(\mu ) \|_{U}\)  (A1.7)
holds for a fixed parameter \(\mu \in \mathcal {P}\) with probability at least 1 − 2δ.
Proof
Denote \(\mathbf {x} := \mathbf {R}^{-1}_{U} \mathbf {l} (\mu ) / \| \mathbf {l}(\mu ) \|_{U^{\prime }}\) and y := (ur(μ) −wp(μ))/∥ur(μ) −wp(μ)∥U. Let us consider \(\mathbb {K} = \mathbb {C}\), which also accounts for the real case \(\mathbb {K} = \mathbb {R}\). Let ω, with |ω| = 1, be chosen such that \(\langle \mathbf {x} , \omega \mathbf {y} \rangle _{U} - \langle \mathbf {x} , \omega \mathbf {y} \rangle ^{\mathbf {\Theta }}_{U}\) is a real number.
By a union bound for the probability of success, Θ is an ε-embedding for span(x + ωy) and span(x − ωy) with probability at least 1 − 2δ. Then, using the parallelogram identity we obtain
We conclude that relation (A1.7) holds with probability at least 1 − 2δ. □
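The polarization step of this proof can be checked numerically in the real case (ω = ±1); a small added illustration with RU = I and a scaled Gaussian Θ, where the identity is exact and the ε in the final bound is a loose illustrative value:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 1000, 200
Theta = rng.standard_normal((k, n)) / np.sqrt(k)     # scaled Gaussian embedding
x, y = rng.standard_normal(n), rng.standard_normal(n)

sx, sy = Theta @ x, Theta @ y
# Polarization (parallelogram) identity for the sketched product (real case):
lhs = sx @ sy
rhs = 0.25 * (np.linalg.norm(sx + sy) ** 2 - np.linalg.norm(sx - sy) ** 2)
assert np.isclose(lhs, rhs)

# Embedding quality on span(x + y) and span(x - y) transfers to the product:
eps = 0.5                              # loose illustrative distortion level
err = abs(x @ y - lhs)
assert err <= 0.5 * eps * (np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2)
```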
Proposition A.5
Let L ⊂ U denote a subspace containing \(\{\mathbf {R}^{-1}_{U} \mathbf {l}(\mu ): \mu \in \mathcal {P} \}\). If Θ is an ε-embedding for L + Ur + Wp, then (A1.7) holds for all \(\mu \in \mathcal {P}\).
Proof
We can use a similar proof as in Proposition A.4 with the fact that if Θ is an ε-embedding for L + Ur + Wp, then it satisfies the ε-embedding property for span(x + ωy) and span(x − ωy). □
It follows that the accuracy of \(s^{\star }_{r}(\mu )\) can be controlled through the quality of Wp for approximating \({\mathscr{M}}_{r}\), the quality of Θ as a U → ℓ2 ε-embedding, or both. Note that choosing Θ as a null matrix (i.e., an ε-embedding for U with ε = 1) leads to a single first-phase approximation (A1.4), while letting Wp := {0} corresponds to a single sketched (second-phase) approximation. Such particular choices for Θ or Wp can be pertinent when the subspace Wp is highly accurate, so that there is practically no benefit in using a sketched correction, or, the other way around, when the computational environment or the problem does not permit a sufficiently accurate approximation of \({\mathscr{M}}_{r}\) with Wp, therefore making the use of a non-zero wp(μ) unjustified.
Remark A.6
When interpreting random sketching as a Monte Carlo method for the estimation of the inner product 〈l(μ),ur(μ)〉, the proposed approach can be seen as a control variate method where wp(μ) plays the role of the control variate. A multilevel Monte Carlo method with different control variates could further improve the efficiency of the post-processing.
1.2 A1.2 Construction of reduced subspaces
We now address the problem of computing the basis vectors for Wp. In general, the strategy for obtaining Wp has to be chosen according to the problem's structure and the constraints of the computational environment.
A simple way, used in [3], is to choose Wp as the span of samples of u(μ), either chosen randomly or during the first few iterations of the reduced basis (or dictionary) generation with a greedy algorithm. Such a Wp, however, may be too costly to operate with. We therefore propose more sophisticated constructions of Wp below.
Approximate Proper Orthogonal Decomposition
A subspace Wp can be obtained by an (approximate) POD of the reduced vectors evaluated on a training set \(\mathcal {P}_{\text {train}} \subseteq \mathcal {P}\). Here, randomized linear algebra can again be employed to improve efficiency. The computational cost of the proposed POD procedure consists mainly of the solution of \(m = \# \mathcal {P}_{\text {train}}\) reduced problems and the multiplication of Ur by \(p=\dim (W_{p}) \ll r\) small vectors. Unlike the classical POD, our methodology does not require the computation or maintenance of full solution samples and therefore allows large training sets.
Let \(L_{m} = \{ \mathbf {a}_{r} (\mu ^{i})\}^{m}_{i=1}\) be a training sample of the coordinates of ur(μ) in the basis Ur. We look for a POD subspace Wp associated with the snapshot matrix \(\mathbf {U}_{r} \mathbf {L}_{m}\), where Lm is a matrix whose columns are the elements of Lm.
An accurate estimation of the POD basis can be efficiently computed via the sketched method of snapshots introduced in [3, Section 5.2]. More specifically, a quasi-optimal (with high probability) POD basis can be calculated as
\(\mathbf {W}_{p} = \mathbf {U}_{r} \mathbf {T}^{*}_{p},\)  (A1.8)
where
\(\mathbf {T}^{*}_{p} = \mathbf {L}_{m} [\mathbf {t}_{1}, \ldots , \mathbf {t}_{p}],\)
with \(\mathbf {t}_{1}, \dots , \mathbf {t}_{p}\) being the p dominant right singular vectors of \(\mathbf {U}^{\mathbf {\Theta }}_{r} \mathbf {L}_{m}\). Note that the matrix \(\mathbf {T}^{*}_{p}\) can be efficiently obtained with a computational cost independent of the dimension of the full order model. The dominant cost is the multiplication of Ur by \(\mathbf {T}^{*}_{p}\), which is also expected to be inexpensive since \(\mathbf {T}^{*}_{p}\) has a small number of columns. Guarantees for the quasi-optimality of Wp can be readily derived from [3, Theorem 5.5].
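A sketch of this procedure in NumPy (an added illustration with synthetic coordinates carrying a planted rank-p structure; here the right singular vectors are additionally scaled by 1/σi so that the sketched basis is orthonormal, a normalization choice made for this illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n, r, m, p, k = 4000, 30, 200, 5, 300

U_r = rng.standard_normal((n, r))
# Training coordinates with a planted rank-p structure plus small noise.
L_m = rng.standard_normal((r, p)) @ rng.standard_normal((p, m))
L_m += 0.01 * rng.standard_normal((r, m))

Theta = rng.standard_normal((k, n)) / np.sqrt(k)
U_r_T = Theta @ U_r                                  # sketch of the reduced basis

# Sketched method of snapshots: SVD of the small k x m matrix U_r^Theta L_m.
_, S, Vh = np.linalg.svd(U_r_T @ L_m, full_matrices=False)
T_p = L_m @ (Vh[:p].conj().T / S[:p])                # T_p^*, r x p (1/sigma scaling)
W_p = U_r @ T_p                                      # POD basis; the only O(n) step

# W_p should capture the dominant p-dimensional structure of the snapshots.
M = U_r @ L_m
Q, _ = np.linalg.qr(W_p)
res = np.linalg.norm(M - Q @ (Q.T @ M)) / np.linalg.norm(M)
assert res < 0.1
```

Only the small sketched matrix enters the SVD; the full snapshot matrix M is formed here solely for the sanity check.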
Sketched greedy algorithm
A greedy search over the training set \(\{ \mathbf {u}_{r}(\mu ): \mu \in \mathcal {P}_{\text {train}} \}\) of approximate solutions is another way to construct Wp. At the i-th iteration, Wi is enriched with a vector ur(μi+ 1) with the largest distance to Wi over the training set. Note that in this case the resulting matrix Wp has the form (A1.8), where \(\mathbf {T}^{*}_{p} = [\mathbf {a}_{r}(\mu ^{1}), \dots , \mathbf {a}_{r}(\mu ^{p})]\). The efficiency of the greedy selection can be improved by employing the random sketching technique. At each iteration, the distance to Wi can be measured with the sketched norm \(\| \cdot \|^{\mathbf {\Theta }}_{U}\), which can be computed from the sketches \(\mathbf {\Theta } \mathbf {u}_{r}(\mu ) = \mathbf {U}^{\mathbf {\Theta }}_{r} \mathbf {a}_{r}(\mu )\) of the approximate solutions, with no need to operate with the large matrix Ur but only with its sketch. This allows efficient computation of the quasi-optimal interpolation points \(\mu ^{1}, \dots , \mu ^{p}\) and the associated matrix \(\mathbf {T}^{*}_{p}\). Note that for numerical stability an orthogonalization of Wi with respect to \(\langle \cdot , \cdot \rangle ^{\mathbf {\Theta }}_{U}\) can be performed, which can be done by modifying \(\mathbf {T}^{*}_{i}\) so that \(\mathbf {U}^{\mathbf {\Theta }}_{r} \mathbf {T}^{*}_{i}\) is an orthogonal matrix. Such a \(\mathbf {T}^{*}_{i}\) can be obtained with a standard QR factorization. When \(\mathbf {T}^{*}_{p}\) has been computed, the matrix Wp can be calculated by multiplying Ur with \(\mathbf {T}^{*}_{p}\). The quasi-optimality of \(\mu ^{1}, \dots , \mu ^{p}\) and the approximate orthogonality of Wp are guaranteed if Θ is an ε-embedding for all subspaces in the set \(\{ W_{p} + \text {span}(\mathbf {u}_{r}(\mu ^{i})) \}^{m}_{i=1}\). 
This property of Θ can be guaranteed a priori by considering \((\varepsilon , \binom {m}{p}^{-1} \delta , p+1)\) oblivious U → ℓ2 subspace embeddings, or certified a posteriori with the procedure explained in Section ??.
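The sketched greedy selection can be sketched as follows (an added illustration with synthetic data and hypothetical dimensions); only the k × m matrix of sketched training vectors is touched inside the loop:

```python
import numpy as np

rng = np.random.default_rng(6)
n, r, m, p, k = 4000, 30, 100, 5, 300

U_r = rng.standard_normal((n, r))
A = rng.standard_normal((r, m))        # columns a_r(mu^i) of training coordinates
Theta = rng.standard_normal((k, n)) / np.sqrt(k)
S = Theta @ U_r @ A                    # sketches Theta u_r(mu^i), k x m

picked, Q = [], np.zeros((k, 0))
for _ in range(p):
    # Sketched distance of each candidate to the current subspace W_i.
    res = S - Q @ (Q.T @ S)
    j = int(np.argmax(np.linalg.norm(res, axis=0)))
    picked.append(j)
    # Gram-Schmidt step: orthonormalize the new sketched vector against Q.
    Q = np.hstack([Q, (res[:, j] / np.linalg.norm(res[:, j]))[:, None]])

# T_p^* = [a_r(mu^1), ..., a_r(mu^p)]; W_p = U_r @ A[:, picked] is formed once
# at the end, optionally re-orthogonalized so that Theta @ W_p is orthonormal.
assert len(set(picked)) == p           # p distinct interpolation points
```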
Coarse approximation
Let us notice that the online cost of evaluating \(s_{r}^{\star }(\mu )\) does not depend on the dimension p of Wp. Consequently, if Wp is spanned by structured (e.g., sparse) basis vectors then a rather high dimension is allowed (possibly larger than r).
For classical numerical methods for PDEs, the resolution of the mesh (or grid) is usually chosen to guarantee both the approximability of the solution manifold by the approximation space and stability. For many problems the latter factor is dominant, and one chooses the mesh primarily according to it. This is a typical situation for wave problems, advection-diffusion-reaction problems, and many others. For such problems, the resolution of the mesh can be much higher than needed for the estimation of the quantity of interest from a given solution field. In these cases, the quantity of interest can be efficiently yet accurately approximated using a coarse-grid interpolation of the solution.
Suppose that each vector u ∈ U represents a function \(u: {\Omega } \to \mathbb {K}\) in a finite-dimensional approximation space spanned by basis functions \(\{ \psi _{i}(x) \}^{n}_{i=1}\) associated with a fine mesh of Ω. The function u(x) can be approximated by a projection onto a coarse-grid approximation space spanned by basis functions \(\{ \phi _{i}(x) \}^{p}_{i=1}\). For simplicity, assume that each \(\phi _{i}(x) \in \text {span} \{ \psi _{j}(x) \}^{n}_{j=1}\). Then the i-th basis vector for Wp can be obtained simply by evaluating the coordinates of ϕi(x) in the basis \(\{ \psi _{j}(x) \}^{n}_{j=1}\). Note that for a classical finite element approximation, each basis function has local support and the resulting matrix Wp is sparse.
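For concreteness, a small SciPy illustration in one dimension (an added example under simplifying assumptions, with nested uniform grids): the columns of Wp are coarse piecewise-linear hat functions expressed in the fine nodal basis, giving a sparse matrix:

```python
import numpy as np
from scipy.sparse import lil_matrix

n, c = 1025, 65                        # fine / coarse node counts (nested grids)
ratio = (n - 1) // (c - 1)             # fine nodes per coarse interval

W = lil_matrix((n, c))
for i in range(c):
    center = i * ratio
    for j in range(max(0, center - ratio + 1), min(n, center + ratio)):
        v = 1.0 - abs(j - center) / ratio
        if v > 0.0:
            W[j, i] = v                # hat function phi_i at fine node j
W_p = W.tocsr()                        # sparse, efficient to multiply by

# Each column is a coarse hat function in the fine nodal basis; coarse-grid
# functions are reproduced exactly, and the hats form a partition of unity.
coarse_vals = np.sin(np.pi * np.linspace(0.0, 1.0, c))
fine_interp = W_p @ coarse_vals
assert np.allclose(fine_interp[::ratio], coarse_vals)
assert np.allclose(W_p @ np.ones(c), 1.0)
assert W_p.nnz < 2 * n                 # local supports give O(n) nonzeros
```

The local supports keep the number of nonzeros of Wp proportional to n rather than n·p, so multiplication by Wp remains cheap even for a rather high coarse dimension.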
1.2.1 A1.2.1 Numerical validation
The proposed methodology was validated experimentally on an invisibility cloak benchmark from Section ??.
As in Section ??, in this experiment we used an approximation space Ur of dimension r = 150 constructed with a greedy algorithm (based on the sketched minres projection). We used a training set of 50,000 uniform random samples in \(\mathcal {P}\), while the test set \(\mathcal {P}_{\text{test}} \subset \mathcal {P}\) was taken as 1000 uniform random samples in \(\mathcal {P}\). The experiment was performed for a fixed approximation ur(μ) obtained with the sketched minres projection onto Ur using Θ with 1000 rows. For such ur(μ), an approximate extraction of the quantity sr(μ) = l(ur(μ); μ) from ur(μ) (represented by its coordinates in the reduced basis) was considered.
The post-processing procedure was performed by choosing l(μ) and ur(μ) in (A1.1) as \(\mathbf {u}_{r}(\mu ) - \mathbf {u}^{in}(\mu )\). Furthermore, for better accuracy the solution space was here equipped with the (semi-)inner product \(\langle \cdot , \cdot \rangle _{U} := \langle \cdot , \cdot \rangle _{L^{2}({\Omega }_{1})}\), which is different from the inner product (??) considered for ensuring quasi-optimality of the minres projection and error estimation (see Remark A.2). For such choices of l(μ), ur(μ) and 〈⋅,⋅〉U, we employed a greedy search with the error indicator \(\min \limits _{\mathbf {w} \in W_{i}} \|\mathbf {u}_{r}(\mu ) - \mathbf {w} \|^{\mathbf {\Theta }}_{U}\) over a training set of 50,000 uniform samples in \(\mathcal {P}\) to find Wp. Then sr(μ) was efficiently approximated by \(s^{\star }_{r}(\mu )\) given in (A1.6). In this experiment, the error is characterized by \(e^{\mathrm {s}}_{\mathcal {P}}=\max \limits _{\mu \in \mathcal {P}_{\text {test}}} |s(\mu )-\tilde {s}_{r}(\mu )| / A_{s}\), where \(\tilde {s}_{r}(\mu ) = s_{r}(\mu )\) or \(s^{\star }_{r}(\mu )\). The statistical properties of \(e^{\mathrm {s}}_{\mathcal {P}}\) for each value of k and \(\dim (W_{p})\) were obtained with 20 realizations of Θ. Figure 10a exhibits the dependence of \(e^{\mathrm {s}}_{\mathcal {P}}\) on the size of Θ with Wp of dimension \(\dim {(W_{p})} =15\). Furthermore, in Fig. 10b we provide the maximum value of \(e^{\mathrm {s}}_{\mathcal {P}}\) from the computed samples for different sizes of Θ and Wp. The accuracy of \(s^{\star }_{r}(\mu )\) can be controlled by the quality of Wp for the approximation of ur(μ) and the quality of \(\langle \cdot , \cdot \rangle ^{\mathbf {\Theta }}_{U}\) for the approximation of 〈⋅,⋅〉U. When Wp approximates ur(μ) well, one can use a random correction with a Θ of rather small size, whereas in the alternative scenario a large random sketch is required. 
In this experiment we see that the quality of the output is nearly preserved with high probability when using Wp of dimension \(\dim {(W_{p})} =20\) and a sketch of size k = 1000, or Wp of dimension \(\dim {(W_{p})} =15\) and a sketch of size k = 10,000. For a less accurate Wp, with \(\dim {(W_{p})} \leq 10\), preserving the quality of the output requires larger sketches of size k ≥ 30,000. To optimize efficiency, the dimension of Wp and the size of Θ should be picked depending on the dimensions r and n of Ur and U, respectively, and the particular computational architecture. Increasing the dimension of Wp entails storing and operating with more high-dimensional vectors, while increasing the size of the sketch entails a higher computational cost associated with storing and operating with the sketched matrix \(\mathbf {U}_{r}^{\mathbf {\Theta }} = \mathbf {\Theta } \mathbf {U}_{r}\). Let us finally note that for this benchmark the approximate extraction of the quantity of interest with our approach, using \(\dim {(W_{p})} =15\) and a sketch of size k = 10,000, required about 10 times less storage and complexity than the classical exact extraction.
Appendix 2. Proofs of propositions
Proof of Proposition 3.1
The statement of the proposition follows directly from the definitions of the constants ζr(μ) and ιr(μ), which imply
□
Proof of Proposition 3.2
The proof follows the one of Proposition 3.1. □
Proof of Proposition 3.3
By the assumption on Θ, we have that
holds for all x ∈span{u(μ)} + Ur. Then (??) follows immediately. □
Proof of Proposition 3.5
Let \(\mathbf {a} \in \mathbb {K}^{r}\) and x := Ura. Then
Since Θ is an ε-embedding for Ur, we have
The statement of the proposition follows immediately.
□
Proof of Proposition 4.1
Take arbitrary τ > 0, and let \(\mathcal {D}^{(i)}_{K}\) be dictionaries with \(\# \mathcal {D}^{(i)}_{K} = K_{i}\), such that
The existence of \(\mathcal {D}^{(i)}_{K}\) follows directly from the definition (??) of the dictionary-based width. Define \( {\mathcal {D}}^{*}_{K} = \bigcup ^{l}_{i=1} \mathcal {D}^{(i)}_{K}\). Then the following relations hold:
Since the above relations hold for arbitrary positive τ, we conclude that
□
Proof of Proposition 4.3
Let \(U^{*}_{r}(\mu ) := \arg \min \limits _{W_{r} \in {\mathscr{L}}_{r}(\mathcal {D}_{K}) } \| \mathbf {u}(\mu ) - \mathbf {P}_{W_{r}} \mathbf {u}(\mu )\|_{U}\). By definition of ur(μ) and constants ζr,K(μ) and ιr,K(μ),
which ends the proof. □
Proof of Proposition 4.4
The proof exactly follows the one of Proposition 4.3 by replacing \(\| \cdot \|_{U^{\prime }}\) with \(\| \cdot \|^{\mathbf {\Theta }}_{U^{\prime }}\). □
Proof of Proposition 4.5
We have that
holds for all x ∈span{u(μ)} + Wr with \(W_{r} \in {\mathscr{L}}_{r}(\mathcal {D}_{K})\). The statement of the proposition then follows directly from the definitions of \(\zeta _{r,K}(\mu ), \iota _{r,K}(\mu ), \zeta ^{\mathbf {\Theta }}_{r,K}(\mu )\) and \(\iota ^{\mathbf {\Theta }}_{r,K}(\mu )\). □
Proof of Proposition 4.6
Let matrix UK have columns {wi : i ∈{1⋯K}} and matrix Ur(μ) have columns \(\{\mathbf {w}_{i}: i \in \mathcal {I}(\mu ) \}\), with \(\mathcal {I}(\mu ) \subseteq \{1, \cdots ,K\}\) being a subset of r indices. Let \(\mathbf {x} \in \mathbb {K}^{r}\) be an arbitrary vector. Define a sparse vector \(\mathbf {z}(\mu ) = \left (z_{i}(\mu ) \right )_{i \in \{1 {\cdots } K \}} \) with \(\left (z_{i}(\mu ) \right )_{i \in \mathcal {I}(\mu )} = \mathbf {x}\) and zeros elsewhere. Let w(μ) := Ur(μ)x = UKz(μ). Then
Similarly,
The statement of the proposition follows immediately. □
Proof of Proposition 5.2
Using Proposition A.4 in Appendix 1 with l(μ) := RUx, ur(μ) := y, wp(μ) := 0, Θ := Θ∗, ε := ε∗ and δ := δ∗, we have
from which we deduce that
holds with probability at least 1 − 2δ∗. In addition,
and
The statement of the proposition can be now derived by combining (A2.2) to (A2.4) and using a union bound argument. □
Proof of Proposition 5.3
Observe that
Let us make the following assumption:
For the alternative case the proof is similar.
Next, we show that \(\bar {\omega }\) is an upper bound for ω with probability at least 1 − δ∗. Define \(\mathbf {x}^{*}:= \arg \min \limits _{\mathbf {x} \in V / \{ \mathbf {0} \}, \| \mathbf {x} \|_{U}=1} \| \mathbf {x}\|^{\mathbf {\Theta }}_{U}\). By definition of Θ∗,
holds with probability at least 1 − δ∗. If (A2.5) is satisfied, we have
□
Proof of Proposition 5.4
By definition of ω and the assumption on Θ∗, for all x ∈ V, it holds
The above relations and the definition (??) of \(\bar {\omega }\) yield
which ends the proof. □
Cite this article
Balabanov, O., Nouy, A. Randomized linear algebra for model reduction—part II: minimal residual methods and dictionary-based approximation. Adv Comput Math 47, 26 (2021). https://doi.org/10.1007/s10444-020-09836-5