Data-driven and active learning of variance-based sensitivity indices with Bayesian probabilistic integration
Introduction
Owing to the rapid growth of computational power, scientific computation based on computer simulators (e.g., finite element models) is now widely used in both academic research and engineering practice for predicting the behavior of complex systems or structures and for aiding the design of new products. However, because of uncertainties from various sources, researchers and practitioners have found it difficult to achieve accurate and robust predictions with deterministic simulators, and performing uncertainty quantification to properly incorporate those uncertainties in the model predictions has become a common trend in scientific computing [1], [2], especially in structural dynamics. As an important sub-task of uncertainty quantification, Sensitivity Analysis (SA) plays an important role in model development and refinement, as it identifies the main sources of model prediction uncertainty [3], [4], [5]. This information is extremely useful for directing future data collection (with the target of effectively reducing the model prediction uncertainty), and for specifying the subset of influential model parameters to be calibrated in finite element (FE) model updating [6].
Specifically, SA aims at attributing the uncertainty present in the model output to the input variables, and in this way measures the contribution of each input variable to the uncertainty of the model outputs [7]. Three groups of SA methods have been developed, i.e., local SA, regional SA, and global SA (GSA); one can refer to Refs. [4], [5] for comprehensive reviews and comparisons of these methods. The local method measures the sensitivity of each input variable using local partial derivatives, and it is widely used in the area of structural reliability for measuring the effects of the distribution parameters of input variables on the failure probability [8], [9]. Regional SA aims at quantifying the contributions of subregions of the distribution support of each input variable to the uncertainty of model outputs, and it can be especially useful for the reduction of epistemic uncertainty [10]. The GSA indices are usually defined as the expected change of the statistical features (e.g., variance and density function) of the model response when the input variables are fixed over their full supports, and thus summarize the overall contribution of the uncertainty present in the input variables to that of the model outputs.
Among the above three groups of methods, GSA has received the greatest attention during the past few decades, and plenty of GSA techniques/indices have been developed for different purposes. The screening methods have been developed for screening out the non-influential variables in moderate- to high-dimensional problems [11], [12]. The variance-based sensitivity indices [13], [14], [15], rooted in the Random sampling-high dimensional model representation (RS-HDMR) [16], aim at measuring the relative importance of the input variables by attributing the model response variance to each input variable and their interactions. Considering the setting of uncertainty reduction, a modified version of the variance-based sensitivity indices, called W-indices, has also been developed for quantifying the effects of reducing the input uncertainty on that of the model output [17]. Given that the variance is not sufficient for characterizing the uncertainty, the moment-independent sensitivity indices have also been devised for investigating the effect of each input variable on the full probability distribution of the model response [18], [19], [20]. The derivative-based sensitivity indices have also been established to realize variable screening with lower computational cost than the variance-based ones [21], [22], [23]. The global reliability sensitivity indices have been developed in the area of structural reliability, based on the variance-based indices, for measuring the contribution of input variables to the failure probability of structures [24], [25], [26]. Despite the wide range of GSA indices that have been developed, the variance-based ones continue to attract the most attention from both researchers and practitioners, owing to their elegant mathematical interpretations for both independent and dependent variables, as well as their ability to capture different types of effects [7], [27], [28].
Developing efficient and robust algorithms for estimating variance-based indices is then one of the most relevant challenges for performing the GSA analysis.
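For orientation, the variance-based indices discussed above can be written compactly as follows. This is the standard textbook form and not necessarily the notation used in the paper (whose operators are declared in its Table 1). The symbols $Y = g(\boldsymbol{X})$, $\boldsymbol{X} = (X_1, \ldots, X_d)$, $\boldsymbol{X}_{\sim i}$ (all inputs except $X_i$), $V_i$, $S_i$ and $S_{Ti}$ are assumptions introduced here for illustration.

```latex
% Variance decomposition (independent inputs assumed)
V(Y) = \sum_{i=1}^{d} V_i + \sum_{1 \le i < j \le d} V_{ij} + \cdots + V_{12 \cdots d}

% First-order partial variance and main effect index of X_i
V_i = V\!\left[ E\!\left( Y \mid X_i \right) \right],
\qquad
S_i = \frac{V_i}{V(Y)}

% Total effect index of X_i (X_{\sim i}: all inputs except X_i)
S_{Ti} = \frac{E\!\left[ V\!\left( Y \mid \boldsymbol{X}_{\sim i} \right) \right]}{V(Y)}
       = 1 - \frac{V\!\left[ E\!\left( Y \mid \boldsymbol{X}_{\sim i} \right) \right]}{V(Y)}
```

Under these definitions $S_i \le S_{Ti}$, and the gap between the two reflects the interaction effects involving $X_i$.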
The past few decades have witnessed a rapid development of numerical algorithms for variance-based sensitivity indices, and one can refer to Ref. [29] for a comprehensive review of these developments. Generally, these methods can be divided into three classes, i.e., Fourier amplitude sensitivity test (FAST), (quasi-) Monte Carlo simulation (MCS), and surrogate models. The FAST method, developed in the area of computational chemistry [30], estimates the partial variance terms involved in the variance-based sensitivity indices based on periodic sampling and Fourier transformation, and it has been widely studied and substantially improved since its development (see e.g., Refs. [31], [32], [33], [34]). The MCS method involves first formulating the partial variance terms as multi-dimensional integrals, and then utilizing MCS, driven by simple random sampling, Latin Hypercube Sampling (LHS) [35], or Sobol’s low-discrepancy sequence [36], to estimate these integrals. Following this scheme, a multitude of MCS estimators have been developed (see e.g., Refs. [37], [38], [39], [40]). The surrogate models, such as state dependent regression [41], polynomial chaos expansion [42], support vector regression [43], and Kriging, also called Gaussian Process Regression (GPR) [44], [45], [46], [47], have also been investigated for estimating the sensitivity indices. In terms of reliability of estimation, MCS is the most competitive scheme, as confidence intervals can be computed for the sensitivity indices from the MCS estimators, but it also suffers from the large number of required simulator calls, which makes it inapplicable to computationally expensive simulators.
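To make the MCS scheme and its cost concrete, the following is a minimal sketch of a "pick-freeze" Monte Carlo estimator. It is not the estimator proposed in this paper. It uses one commonly cited pair of formulas (a Saltelli-type estimator for the first-order partial variance and Jansen's estimator for the total effect) with simple random sampling. The names `sobol_pick_freeze`, `model` and `sample_inputs` are hypothetical and introduced only for illustration.

```python
import numpy as np


def sobol_pick_freeze(model, sample_inputs, n, d, rng):
    """Estimate main and total variance-based indices by pick-freeze MCS.

    model:         callable mapping an (n, d) array to an (n,) array (assumed vectorized)
    sample_inputs: callable (rng, n, d) -> (n, d) array of independent input samples
    Returns (main, total), two arrays of length d.
    """
    A = sample_inputs(rng, n, d)  # first independent sample matrix
    B = sample_inputs(rng, n, d)  # second independent sample matrix
    yA = model(A)
    yB = model(B)
    var_y = np.var(np.concatenate([yA, yB]), ddof=1)  # total output variance

    main = np.empty(d)
    total = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]  # matrix A with only column i taken from B
        yAB = model(AB)
        # First-order partial variance (Saltelli-type estimator)
        main[i] = np.mean(yB * (yAB - yA)) / var_y
        # Total partial variance (Jansen estimator)
        total[i] = 0.5 * np.mean((yA - yAB) ** 2) / var_y
    return main, total


if __name__ == "__main__":
    # Hypothetical additive test model Y = X1 + 2*X2 with standard normal inputs,
    # for which the main and total indices coincide (no interaction).
    rng = np.random.default_rng(0)
    g = lambda X: X[:, 0] + 2.0 * X[:, 1]
    draw = lambda rng, n, d: rng.standard_normal((n, d))
    print(sobol_pick_freeze(g, draw, 100_000, 2, rng))
```

This sketch calls the simulator n(d + 2) times, which illustrates why the plain MCS scheme becomes impractical when a single simulator run is expensive.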
In recent years, Bayesian numerical analysis [48] with its different variants, such as Bayesian probabilistic optimization [49], Bayesian Probabilistic Integration (BPI) [50], [51], and Bayesian probabilistic Partial Differential Equation (PDE) solution [52], has emerged as a cutting-edge method in scientific computation. The aim of this work is therefore to extend the BPI methods for inferring the variance-based sensitivity indices from data and computer simulators. This topic has also been investigated in Ref. [46] in a fully Bayesian scheme and in Ref. [53] with the so-called Bayesian MCS scheme, but in both papers, only the posterior mean and the main effect indices are investigated. In this work, both the posterior means and posterior variances are first investigated for both the main and total variance-based indices based on BPI, following which a data-driven BPI approach and an adaptive BPI approach are developed for efficiently estimating the sensitivity indices. To achieve this goal, two principal lemmas are first developed for realizing the Bayesian inference, and then the posterior means and variances are both analytically derived for the sensitivity indices, where the posterior variances summarize the discretization errors for estimating these sensitivity indices. These analytical results form the basis of the data-driven BPI approach, with which the posterior features of the sensitivity indices can be inferred from any supervised learning data. To further improve the efficiency of the algorithm for computationally expensive simulators, an adaptive experiment design strategy is ultimately introduced. The effectiveness of the proposed methods is demonstrated by numerical examples, and their applicability to real-world engineering problems as well as their engineering significance are illustrated by three engineering benchmarks with FE simulators.
The rest of this paper is organized as follows. Section 2 briefly reviews the variance-based sensitivity indices and the BPI approach, followed by the core developments in Section 3, which includes the Bayesian inference of the sensitivity indices and the data-driven BPI. The adaptive BPI approach is then developed in Section 4, followed by the numerical and engineering test examples in Section 5. Section 6 closes the paper with conclusions.
Section snippets
Brief review of related topics
Before the introduction of the main developments, it is helpful to briefly review two important topics to be studied/utilized in this article, i.e., the variance-based sensitivity indices and the BPI. The expectation and variance operators utilized in this paper are declared in Table 1 for avoiding confusion.
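As a rough illustration of the idea behind BPI, the sketch below places a GPR prior on a one-dimensional model function and computes the resulting Gaussian posterior of its expectation in closed form. It is a simplified stand-in rather than the formulation reviewed in the paper. The zero prior mean, the squared-exponential kernel, the standard normal input, the fixed hyperparameters, and the function name `bayesian_quadrature_1d` are all assumptions made here for illustration.

```python
import numpy as np


def bayesian_quadrature_1d(x_train, y_train, length=1.0, signal_var=1.0, jitter=1e-10):
    """Posterior mean and variance of I = E[g(X)] with X ~ N(0, 1).

    Assumes a zero-mean GP prior on g with squared-exponential kernel
    k(x, x') = signal_var * exp(-(x - x')^2 / (2 * length^2)) and fixed
    (not learned) hyperparameters.
    """
    x = np.asarray(x_train, dtype=float)
    y = np.asarray(y_train, dtype=float)
    l2 = length ** 2

    # Gram matrix of the training inputs, with a small jitter for stability
    diff = x[:, None] - x[None, :]
    K = signal_var * np.exp(-0.5 * diff ** 2 / l2) + jitter * np.eye(x.size)

    # Kernel mean: z_i = integral of k(x_i, x') N(x'; 0, 1) dx' (closed form)
    z = signal_var * np.sqrt(l2 / (l2 + 1.0)) * np.exp(-0.5 * x ** 2 / (l2 + 1.0))

    # Prior variance of the integral: double integral of k against N(0, 1) twice
    zz = signal_var * np.sqrt(l2 / (l2 + 2.0))

    w = np.linalg.solve(K, z)   # quadrature weights K^{-1} z
    post_mean = w @ y           # posterior mean of the integral
    post_var = zz - w @ z       # posterior variance (discretization error)
    return post_mean, post_var


if __name__ == "__main__":
    # Hypothetical test function g(x) = x^2, whose exact expectation is 1
    x = np.linspace(-4.0, 4.0, 15)
    print(bayesian_quadrature_1d(x, x ** 2))
```

The posterior variance returned here plays the role described in the text: it quantifies the discretization error caused by observing the model function at only a finite number of points.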
Bayesian inference of sensitivity indices
In the previous section, the details of the BPI approach for estimating have been reviewed, and it is concluded that, given the GPR representation of the model function , the posterior distribution of is also Gaussian. Indeed, the induced probabilistic models for any orders of HDMR components (e.g., and ) are Gaussian as well [46], [58]. This provides a basis to infer the posterior features of the first-order partial variances , the total partial variance , and the total
Adaptive experiment design
Until now, we have generated the analytical expressions of the posterior means and posterior variances for all the (partial) variance terms following the Bayesian inference scheme based on the training data . Based on these results, a data-driven method is established for estimating the variance-based sensitivity indices. However, in real-world applications, the sensitivity analysis may also be implemented for computer simulators such as finite element models, which makes it possible to design
An illustrative example
Consider a two-dimensional model with g-function formulated as: where and are independent standard normal random variables. This is a highly nonlinear model with large interaction effects, and the variance-based sensitivity indices can be analytically derived for comparison.
For implementing the adaptive BPI, the stopping criterion is set to be , and the algorithm stops only
Conclusions and discussions
The estimation of the variance-based sensitivity indices is regarded as a statistical inference problem in this work, and based on a set of supervised training data, the posterior features (including means and variances) for all the (partial) variance terms involved in the sensitivity indices are analytically derived following two newly developed first principles and the rationale of BPI. Although the posterior distributions of these (partial) variance terms are no longer Gaussian, these
CRediT authorship contribution statement
Jingwen Song: Methodology, Software, Validation, Visualization, Writing - original draft. Pengfei Wei: Conceptualization, Methodology, Investigation, Writing - review & editing, Funding acquisition. Marcos A. Valdebenito: Validation, Resources, Writing - review & editing. Matthias Faes: Validation, Resources, Writing - review & editing. Michael Beer: Supervision, Project administration.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work is supported by the National Natural Science Foundation of China under Grant No. 51905430, the Sino-German Mobility Program under Grant No. M-0175, the ANID (Agency for Research and Development, Chile) under its program FONDECYT, Grant No. 1180271, and the Research Foundation Flanders (FWO) under Grant No. 12P3519N. The first author is supported by the program of the China Scholarship Council (CSC). The second to fourth authors are all supported by the Alexander von Humboldt Foundation of
References (66)
et al. Variable importance analysis: a comprehensive review. Reliability Engineering & System Safety (2015)
et al. Sensitivity analysis: a review of recent advances. European Journal of Operational Research (2016)
et al. Reliability-based sensitivity estimators of rare event probability in the presence of distribution parameter uncertainty. Reliability Engineering & System Safety (2018)
et al. Sensitivity estimation of failure probability applying line sampling. Reliability Engineering & System Safety (2018)
et al. Regional and parametric sensitivity analysis of Sobol’ indices. Reliability Engineering & System Safety (2015)
et al. An effective screening design for sensitivity analysis of large models. Environmental Modelling & Software (2007)
et al. Non-parametric statistics in sensitivity analysis for model output: a comparison of selected techniques. Reliability Engineering & System Safety (1990)
et al. Importance measures in global sensitivity analysis of nonlinear models. Reliability Engineering & System Safety (1996)
et al. A new variance-based global sensitivity analysis technique. Computer Physics Communications (2013)
A new uncertainty importance measure. Reliability Engineering & System Safety (2007)
Global sensitivity measures from given data. European Journal of Operational Research
Derivative based global sensitivity measures. Procedia-Social and Behavioral Sciences
Derivative-based global sensitivity measures: general links with Sobol’ indices and numerical tests. Mathematics and Computers in Simulation
Efficient sampling methods for global reliability sensitivity analysis. Computer Physics Communications
Global reliability sensitivity estimation based on failure samples. Structural Safety
Global sensitivity analysis of reliability of structural bridge system. Engineering Structures
A new interpretation and validation of variance based importance measures for models with correlated inputs. Computer Physics Communications
Variance-based sensitivity indices for models with dependent inputs. Reliability Engineering & System Safety
An alternative way to compute Fourier amplitude sensitivity test (FAST). Computational Statistics & Data Analysis
Understanding and comparisons of different sampling approaches for the Fourier amplitudes sensitivity test (FAST). Computational Statistics & Data Analysis
Random balance designs for the estimation of first order global sensitivity indices. Reliability Engineering & System Safety
Extension of the RBD-FAST method to the computation of global sensitivity indices. Reliability Engineering & System Safety
A comparison of uncertainty and sensitivity analysis results obtained with random and Latin hypercube sampling. Reliability Engineering & System Safety
Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation
Making best use of model evaluations to compute sensitivity indices. Computer Physics Communications
Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Computer Physics Communications
State dependent parameter metamodelling and sensitivity analysis. Computer Physics Communications
Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering & System Safety
An adaptive sampling method for global sensitivity analysis based on least-squares support vector regression. Reliability Engineering & System Safety
Calculations of Sobol’ indices for the Gaussian process metamodel. Reliability Engineering & System Safety
Adaptive experiment design for probabilistic integration. Computer Methods in Applied Mechanics and Engineering
A Bayesian Monte Carlo-based method for efficient computation of global sensitivity indices. Mechanical Systems and Signal Processing
Do Rosenblatt and Nataf isoprobabilistic transformations really differ? Probabilistic Engineering Mechanics