Learning and correcting non-Gaussian model errors

https://doi.org/10.1016/j.jcp.2021.110152

Highlights

  • A neural network (NN) approach for learning and correcting numerical model errors is proposed.

  • NN modeling error correction is demonstrated for direct and inverse problems.

  • Experimental and numerical corroboration supports the feasibility of NN model error compensation.

Abstract

All discretized numerical models contain modeling errors – this reality is amplified when reduced-order models are used. The ability to accurately approximate modeling errors informs statistics on model confidence and improves quantitative results from frameworks using numerical models in prediction, tomography, and signal processing. Further to this, the compensation of highly nonlinear and non-Gaussian modeling errors, arising in many ill-conditioned systems aiming to capture complex physics, is a historically difficult task. In this work, we address this challenge by proposing a neural network approach capable of accurately approximating and compensating for such modeling errors in augmented direct and inverse problems. The viability of the approach is demonstrated using simulated and experimental data arising from differing physical direct and inverse problems.

Introduction

Discretized numerical models, such as those based on the spectral element method and the finite element method (FEM), are widely used to simulate physical systems that vary in space, or in space and time [1], [2]. The most familiar use of discretized numerical methods is likely the solution of partial differential equations (PDEs). The motivation for discretizing and solving PDEs numerically is broad but is largely centered on estimating solutions to problems with arbitrary geometry and boundary conditions that may otherwise be difficult to model analytically. Numerical solutions to PDEs also enrich understanding of the spatiotemporal evolution of complex physics, for example in black hole dynamics [3], geophysical fluid dynamics [4], and elastoplasticity [5].

Of course, this flexibility comes at a cost: pervasive modeling errors. These errors, largely a consequence of simplifying assumptions and the discretization refinement itself [6], corrupt numerical solutions and are a source of frustration for many scientists and engineers. Adding to the complexity of this situation, users often need to weigh the fidelity and resolution of numerical solutions against the expectation of errors (which are, themselves, uncertain). To illustrate this phenomenon, consider a classical example where the FEM equipped with quadratic quadrilateral (Q8) elements is used to compute the displacement field of a fixed Timoshenko beam. To investigate the modeling errors, we can choose varying levels of discretization refinement (i.e. h refinement) and benchmark against the analytical solution for the displacement field Δ as a function of the horizontal and vertical coordinates x and y, respectively. Using the analytical solution, the vertical displacements for a fictitious beam shown in Fig. 1(a) are given by $\Delta(x, y=0) = \frac{F}{6EI}\left[x^2(3L - x) + x(5\nu + 4)\frac{d^2}{4}\right]$ [7], where F=2 MN is the end load, L=10 m is the beam length, E=200 MPa is the modulus of elasticity, ν=0.33 is the Poisson ratio, I is the moment of inertia, and d=5 m is the depth of the beam's rectangular cross section. We show the trial solutions to this problem in Fig. 1(b), where the accuracy of the finite element (FE) solutions is indicated by their deviations from the analytical solution and further quantified in Fig. 1(c) using a standard L2 error norm plot. The illustrative problem here typifies the expected error convergence with increasing h refinement; that is, as the mesh densifies, the FE errors approach zero asymptotically. This realization reinforces the fact that, because FE sizes cannot be infinitesimal, numerical modeling errors are unavoidable (although, for this simple example, error convergence to machine precision is possible).
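As a rough illustration of the benchmark described above, the following Python sketch evaluates the analytical centerline displacement with the stated beam parameters and defines a discrete relative L2 error that could be used to compare a sampled FE solution against it. The source does not give the out-of-plane thickness, so a unit thickness t = 1 m is assumed here in order to compute I; the helper names `centerline_displacement` and `relative_l2_error` are hypothetical and are not taken from the paper.

```python
# Sketch under stated assumptions: analytical centerline displacement of the beam.
# Assumption (not in the source): unit out-of-plane thickness t = 1 m,
# so that the moment of inertia is I = t * d**3 / 12.

F = 2.0e6     # end load, N (2 MN)
L = 10.0      # beam length, m
E = 200.0e6   # modulus of elasticity, Pa (200 MPa, as stated in the text)
nu = 0.33     # Poisson ratio
d = 5.0       # depth of the rectangular cross section, m
t = 1.0       # assumed thickness, m (hypothetical)
I = t * d**3 / 12.0


def centerline_displacement(x):
    """Delta(x, y=0) = F/(6EI) * [x^2 (3L - x) + x (5 nu + 4) d^2 / 4]."""
    bending = x**2 * (3.0 * L - x)
    shear = x * (5.0 * nu + 4.0) * d**2 / 4.0
    return F / (6.0 * E * I) * (bending + shear)


def relative_l2_error(approx, exact):
    """Discrete relative L2 error between two fields sampled at the same points."""
    num = sum((a - e) ** 2 for a, e in zip(approx, exact)) ** 0.5
    den = sum(e ** 2 for e in exact) ** 0.5
    return num / den


# Sample the analytical solution at 11 equally spaced points along the beam.
xs = [L * i / 10 for i in range(11)]
exact = [centerline_displacement(x) for x in xs]
tip = exact[-1]  # displacement at the loaded end, x = L
```

Under the assumed unit thickness, the tip displacement evaluates to 0.3765 m. A sampled FE solution on the same points could be passed to `relative_l2_error` as `approx` to reproduce the kind of convergence study summarized in Fig. 1(c).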

Not surprisingly, significant research has been conducted to reduce model errors dating back to the inception of computing in science and engineering, especially by those working in inverse problems where modeling errors result in artifacts in medical, geophysical, and material imaging [8]. Modeling errors are also a serious consideration for engineers using FE solutions to design infrastructure. For example, in some engineering cases, 5 or 10 percent modeling error may be tolerable; however, engineering judgment and discretion often play key roles in such a design process. Conversely, when the uncertainty of modeling error is sufficiently high or cannot be reliably estimated, minimizing errors may be the only option. Appropriately, researchers in these fields have developed a number of regimes for understanding modeling errors due to, e.g., the well-known physically unrealistic zero-energy modes in low-order elements and mesh-size dependence [9].

General approaches to approximating modeling errors (ultimately leading to their subtraction from the discretized solution) remain, however, very challenging [10]. Without question, authors such as Kaipio, Arridge, and coauthors have made substantial progress towards approximating assumed Gaussian modeling errors [11], [12], [13], while benefiting from improved-accuracy model reduction. Meanwhile, other modeling error approximation approaches have been effective in linear situations [14]. In cases where modeling errors are not linear as a function of the PDE solution or the error statistics are unknown a priori, available and accessible methods for approximating numerical modeling errors are sparse. First advances have been made for linear inverse problems by incorporating an implicit model correction [15] within a learned gradient descent scheme [16]. Explicit corrections, as we investigate here, have been recently analyzed in [17], where the authors show that, under sufficient approximation accuracy, we can expect to obtain solutions close to those obtained with a correct model. We take this as motivation, extend the approach to more demanding nonlinear inverse problems, and verify the proposed techniques on experimental data. In particular, the approach discussed here provides an explicit quantification of the model error and not only a correction.

Considering first the realization that PDEs are themselves often highly nonlinear or that their solutions are nonlinearly dependent on the input parameters, one may quickly surmise that a “one size fits all” approach to quantifying modeling errors is possibly intractable. However, delving more deeply into how PDEs are numerically solved, one observes that there are two broad schools of approaches: namely, physics-based models and physics-informed models. The latter employ, e.g., neural networks (NNs) to develop surrogate models which emulate physics-based models and generate PDE solutions that are valid within the space of the physics-based training data. Such approaches have been the source of significant research in the past few years [18], [19], [20], [21]; this interest largely stems from the speed and accuracy of simulating physics with well-trained NNs. Accordingly, the success of NNs in developing nonlinear maps from causalities to PDE solutions leads to the ambition that NNs can also be used to develop effective nonlinear maps from PDE solutions to errors in PDE solutions. Assuming such error maps can be realized, the use of NNs developed via deep learning for approximating numerical modeling errors has the distinct advantage that error statistics are not required a priori. It is worth remarking, as a caution, that the accuracy of NN predictions is heavily dependent on the training data. Accordingly, significant care should be taken when generating training data and assessing the reliability of NN predictions of modeling error.

Section snippets

Approximating numerical modeling errors using neural networks

The aim of deep learning, and in particular the use of neural networks, is to develop a nonlinear mapping A between two parameter spaces [22]. In the context of this work, we aim to learn the functional relation between discretized numerical PDE solutions u and model errors ϵ such that ϵ=A(u). One advantage of using NNs is that there are numerous architectures and algorithms available for handling differing data structures characterized by u, for example convolutional networks in the case where
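To make the relation ϵ=A(u) concrete, the following Python sketch trains a very small fully connected network on synthetic pairs (u, ϵ). It is a toy under stated assumptions and not the architecture used in the paper: the function `true_error` is a hypothetical stand-in for the discrepancy between an accurate and a reduced model, the solution u is taken to be a scalar, and the network size, learning rate, and number of epochs are arbitrary choices.

```python
import math
import random

random.seed(0)


def true_error(u):
    # Hypothetical smooth, nonlinear model error used only to generate toy data.
    return 0.2 * u * u - 0.1 * math.sin(3.0 * u)


# Training pairs (u, eps): in practice u would be a discretized PDE solution.
train_u = [-1.0 + 2.0 * i / 39 for i in range(40)]
train_eps = [true_error(u) for u in train_u]

# One hidden layer with H tanh units; all sizes are illustrative assumptions.
H = 8
w1 = [random.uniform(-1.0, 1.0) for _ in range(H)]
b1 = [random.uniform(-1.0, 1.0) for _ in range(H)]
w2 = [random.uniform(-0.5, 0.5) for _ in range(H)]
b2 = 0.0


def A(u):
    """Network prediction of the model error for a scalar solution value u."""
    hidden = [math.tanh(w1[j] * u + b1[j]) for j in range(H)]
    return sum(w2[j] * hidden[j] for j in range(H)) + b2


def mse():
    """Mean squared error of the network over the training pairs."""
    return sum((A(u) - e) ** 2 for u, e in zip(train_u, train_eps)) / len(train_u)


loss_before = mse()
lr = 0.05
for epoch in range(500):
    for u, e in zip(train_u, train_eps):
        hidden = [math.tanh(w1[j] * u + b1[j]) for j in range(H)]
        pred = sum(w2[j] * hidden[j] for j in range(H)) + b2
        g = 2.0 * (pred - e)  # derivative of the squared error w.r.t. pred
        for j in range(H):
            # Backpropagate through the tanh unit before updating w2[j].
            gh = g * w2[j] * (1.0 - hidden[j] ** 2)
            w2[j] -= lr * g * hidden[j]
            w1[j] -= lr * gh * u
            b1[j] -= lr * gh
        b2 -= lr * g
loss_after = mse()

# Usage: correct a reduced-model value, assuming eps = accurate - reduced.
coarse = 0.4
corrected = coarse + A(coarse)
```

The sign convention in the last two lines (error defined as accurate minus reduced solution) is an assumption made for this sketch. For field-valued u, the scalar input would be replaced by a vector or image and the network by an architecture suited to that data structure.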

Modeling error compensated inverse problems

A number of inverse problems, particularly those in geophysical and medical imaging, suffer from modeling errors. The presence of such errors often results in imaging artifacts and degradation of spatial resolution. Therefore, the inclusion of reliable modeling error approximators is of significant practical value. Generally, inverse problems with a parameterization θ are solved by iteratively minimizing a cost functional of the form $\Psi = \| d - u_A(\theta) \|^2 + R(\theta)$ where d is a vector of
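A minimal Python sketch of this idea is given below: a regularized least-squares cost, consisting of a data misfit plus a regularization term, is minimized with and without adding a learned error term to a reduced forward model. Everything in it is an illustrative assumption rather than the paper's setup: the scalar parameter, the sensitivities `C`, the quadratic stand-in `learned_error` (exact by construction here, in place of a trained network), the regularizer R(θ) = αθ², and the plain gradient descent with a finite-difference derivative.

```python
C = [0.5, 1.0, 1.5]   # hypothetical measurement sensitivities
THETA_TRUE = 1.0      # parameter used to simulate the data
ALPHA = 1e-6          # weight of the toy regularizer R(theta) = ALPHA * theta**2


def approx_model(theta):
    """Reduced (linear) forward model."""
    return [c * theta for c in C]


def accurate_model(theta):
    """Stand-in for the accurate physics; used only to simulate measurements."""
    return [c * theta + 0.3 * (c * theta) ** 2 for c in C]


def learned_error(u):
    """Stand-in for a trained network eps = A(u); exact here by construction."""
    return 0.3 * u * u


def cost(theta, d, compensate):
    """Data misfit plus regularization, optionally with error compensation."""
    pred = approx_model(theta)
    if compensate:
        pred = [u + learned_error(u) for u in pred]
    misfit = sum((di - pi) ** 2 for di, pi in zip(d, pred))
    return misfit + ALPHA * theta ** 2


def minimize(d, compensate, theta=0.5, step=0.02, iters=2000, h=1e-6):
    """Plain gradient descent using a central-difference derivative of the cost."""
    for _ in range(iters):
        grad = (cost(theta + h, d, compensate) - cost(theta - h, d, compensate)) / (2.0 * h)
        theta -= step * grad
    return theta


d = accurate_model(THETA_TRUE)                 # noise-free simulated measurements
theta_plain = minimize(d, compensate=False)    # reduced model only
theta_comp = minimize(d, compensate=True)      # reduced model plus learned error
```

In this toy setting the uncompensated estimate is biased well away from the true parameter, whereas the compensated estimate recovers it closely. With a real trained network the compensation would only be as good as the network's approximation of the error over the relevant parameter range.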

Network structuring

The purpose of tailoring a NN to a specific application, in the context of this work, is to ensure the network is sufficiently robust to (a) make reliable predictions on validation data that are independent of the training data and (b) make predictions within the space of the training data. In this sense, training data can be considered as prior information on the anticipated space of NN predictions. In this work, this is the space of modeling errors to be predicted

Application of NN-error correction to direct problems – linear elasticity

Numerical problems in elasticity are pervasive in science and engineering; for example, in biophysics [26] and multidisciplinary mechanical engineering [27]. Herein, we investigate modeling errors in the well-known two-dimensional linear elastic problem in a geometry Ω having a boundary ∂Ω. We define boundary tractions $\hat{f}$ on Ω1, as part of the boundary. The displacements on Ω2 are then defined by $\hat{u}$. Neglecting body forces, the elastic problem is written $\nabla \cdot \sigma(x) = 0,\ x \in \Omega$; $f(x) = \sigma \bar{n} = \hat{f},\ x \in \Omega_1$; $u = \hat{u},$

Application of NN-error correction to inverse problems – EIT

In this section we test the efficacy of the NN model error approximation approach in the context of electrical impedance tomography (EIT). In EIT, the general aim is to reconstruct electrical conductivity γ from distributed voltage measurements V. Specifically, we consider a so-called self-sensing material with deformation-dependent electrical conductivity (i.e. the material is piezoresistive) subject to an applied stress. The goal of EIT, therefore, is to recover the change in γ due to this

Discussion and conclusion

The aim of this work was to propose a straightforward neural network-based approach capable of accurately approximating and compensating for numerical modeling errors in augmented direct and inverse problems. A strong motivator behind this effort was the general need for model error compensation in the ever-present situation where modeling errors are non-Gaussian. To these aims and motivations, we affirmatively demonstrated the viability of using neural networks for both approximating such

CRediT authorship contribution statement

Danny Smyl: Conceptualization, Investigation, Methodology, Software, Writing – original draft. Tyler N. Tallman: Data curation, Investigation, Methodology, Writing – review & editing. Jonathan A. Black: Conceptualization, Writing – review & editing. Andreas Hauptmann: Methodology, Writing – review & editing. Dong Liu: Methodology, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

DL was supported by National Natural Science Foundation of China (Grant No. 61871356). This work was partially supported by The Academy of Finland Project 336796 (Finnish Centre of Excellence in Inverse Modelling and Imaging, 2018–2025).

References (35)

  • V. Babovic et al., J. Mar. Syst. (2005)
  • C.E. Augarde et al., Finite Elem. Anal. Des. (2008)
  • J. Kaipio et al., J. Comput. Appl. Math. (2007)
  • O.C. Zienkiewicz, Comput. Methods Appl. Mech. Eng. (2006)
  • D. Smyl et al., J. Comput. Phys. (2019)
  • M. Raissi et al., J. Comput. Phys. (2019)
  • M. Raissi et al., J. Comput. Phys. (2018)
  • H. Askes et al., Int. J. Solids Struct. (2011)
  • K.S. Surana et al., The Finite Element Method for Boundary Value Problems: Mathematics and Computations (2016)
  • D. Komatitsch et al., Science (2002)
  • J.G. Baker et al., Phys. Rev. D (2006)
  • A. Natale et al., Dyn. Stat. Clim. Syst. (2016)
  • A. Zervos et al., Int. J. Numer. Methods Eng. (2001)
  • A. Nissinen et al., IEEE Trans. Med. Imaging (2010)
  • A. Lehikoinen et al., Inverse Probl. Imaging (2007)
  • S. Arridge et al., Inverse Probl. (2006)
  • Y. Nievergelt, SIAM Rev. (1994)