Review
Efficiency of uncertainty propagation methods for moment estimation of uncertain model outputs

https://doi.org/10.1016/j.compchemeng.2022.107954

Highlights

  • Seven uncertainty propagation methods were assessed over a set of test functions.

  • The best uncertainty propagation method depends on model characteristics.

  • Numerical integration methods are efficient for models with few uncertain inputs.

  • Monte-Carlo methods are recommended for unknown model characteristics.

  • Polynomial chaos expansion is reliable for estimating the first two moments.

Abstract

Uncertainty quantification and propagation play a crucial role in designing and operating chemical processes. This study computationally evaluates the performance of commonly used uncertainty propagation methods based on their ability to estimate the first four statistical moments of model outputs with uncertain inputs. The metric used to assess the performance is the minimum number of model evaluations required to reach a certain confidence level for the moment estimates. The methods considered include Monte-Carlo simulation, numerical integration, and expansion-based methods. The true values of the moments were calculated by high-density sampling with Monte-Carlo simulations. Ninety-five functions with different characteristics were used in the computational experiments. The results reveal that, despite their accuracy, numerical integration methods’ performance deteriorates quickly with increases in the number of uncertain inputs. The Monte-Carlo simulation methods converge to the moments’ true values with the minimum number of model evaluations if model characteristics are not considered or known.

Introduction

With advances in computing systems and improved computational power, simulation models have become popular tools for assisting with decision-making in chemical process design and operation. Many uncertainties are present in simulation models, e.g., in the model inputs and/or parameters, model formulations, and numerical calculations, and they cause the model outputs to be uncertain. In recent years, the uncertainty due to numerical calculations has been reduced significantly by advanced computational power (Hüllen et al., 2019). Therefore, the primary sources of uncertainty in simulation outputs are uncertain inputs, uncertain parameters, and model form uncertainty. This study investigates the uncertainty of simulation model outputs due to extrinsic uncertainty, which is the uncertainty resulting from a predefined number of uncertain inputs with known distribution parameters (Ankenman et al., 2008).

Model uncertainty is studied and characterized using uncertainty quantification (UQ) methods. UQ methods are also used to reduce uncertainties in the systems to generate reliable output values and increase confidence in the models (Miller et al., 2014). The important steps of UQ are (1) identification of uncertainty sources, (2) characterization of the sources, (3) uncertainty propagation (UP), and (4) analysis of the uncertainties (Gel et al., 2013). Uncertainty propagation investigates the contribution of the uncertainty sources to the final uncertainty of the model. When only extrinsic uncertainty is considered, the UP methods propagate the uncertainty of the inputs (X) to the model outputs (Y=g(X)) of the model g(.) (Lee and Chen, 2009). To propagate extrinsic uncertainty to the outputs, UP methods first require selecting an appropriate statistical representation for the uncertain input variables. Next, the UP is carried out for the model to make statistical inferences regarding the outputs. These inferences are generally made by estimating three main statistical quantities: the probability density function of the outputs, the statistical moments of the outputs, and the probability of a certain outcome, such as failure, based on the output distribution (Yang et al., 2017).

UQ and UP face many challenges, such as discontinuous response surfaces, selection of significant uncertain parameters for models with high dimensionality, highly complex physical/simulation models, and the computational cost associated with UP. Many UP methods in the literature address parts of these challenges (Groen et al., 2014; Luo and Yang, 2017; Wang and Sheen, 2015).

Uncertainty propagation methods are divided into two groups: intrusive and non-intrusive methods. In intrusive methods, the model formulation is needed and is modified to propagate input uncertainty. In non-intrusive methods, the models are treated as black boxes. Lee and Chen (2009) categorized the non-intrusive UP methods into five groups: (1) simulation-based methods, e.g., Monte Carlo (MC) simulations, (2) local expansion-based methods, (3) most probable point-based methods, (4) functional expansion-based methods, e.g., polynomial chaos expansion (PCE), and (5) numerical integration-based methods. It has been established that the moment estimates obtained using local expansion-based UP methods are significantly different from the true values for models with high nonlinearities (Jia et al., 2019; Lee and Chen, 2009). Most probable point-based UP methods are typically used for reliability applications and do not provide accurate estimates of higher statistical moments (Pattabhiraman et al., 2010; Padulo et al., 2007). In addition to these five categories, response-surface-based methods have been used in recent years, where the models of interest are represented through surrogate models (Murcia et al., 2018; Sofi et al., 2020; Tripathy et al., 2016). The response-surface-based methods encompass the fourth category, functional expansion-based methods.
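The mismatch reported for local expansion-based methods on highly nonlinear models can be illustrated with a small worked example. The sketch below is not taken from the study: the model g(x) = exp(x), the input X ~ N(0, 1), and the function names are assumptions chosen because the exact moments are available in closed form (lognormal distribution).

```python
import math


def first_order_taylor_moments(g, mu, sigma, h=1e-5):
    """First-order Taylor (local expansion) estimate of the mean and
    variance of g(X) for X ~ N(mu, sigma^2).

    Illustrative helper, not the implementation used in the study.
    """
    # Slope of g at the input mean, via a central finite difference.
    slope = (g(mu + h) - g(mu - h)) / (2.0 * h)
    mean = g(mu)                    # E[g(X)] is approximated by g(mu)
    variance = (slope * sigma) ** 2  # Var[g(X)] is approximated by (g'(mu) * sigma)^2
    return mean, variance


def exact_lognormal_moments(mu, sigma):
    """Exact mean and variance of exp(X) for X ~ N(mu, sigma^2)."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    variance = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return mean, variance


# Assumed example: g(x) = exp(x) with a standard normal input.
taylor_mean, taylor_var = first_order_taylor_moments(math.exp, 0.0, 1.0)
exact_mean, exact_var = exact_lognormal_moments(0.0, 1.0)

print("Taylor mean, variance:", taylor_mean, taylor_var)
print("Exact  mean, variance:", exact_mean, exact_var)
```

For this assumed model, the linearization places the mean at g(0) = 1 and the variance near 1, whereas the exact values are exp(1/2) and (e - 1)e, so the variance is underestimated by more than a factor of four.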

Most UP methods require the evaluation of complex simulation models and many model runs (Liu and Gupta, 2007). Carrying out UP for complex or high fidelity models that are computationally expensive to evaluate could be prohibitive for achieving accurate results with some UP methods (Rajabi, 2019). Hence, selecting the appropriate UP method is crucial for efficient and accurate UP.

Multiple studies compared the performance of different UP techniques in terms of accuracy and efficiency to guide the selection of an appropriate UP method. Several of these studies compare simulation-based methods to other categories of UP methods. For instance, Klavetter et al. (2012) compared perturbation, Taylor series expansion, and Monte Carlo methods in propagating the uncertainty in slug length and liquid entrainment in the gas core to the outputs of a multiphase flow model. The results showed that Taylor series expansion overestimated variance for most outputs, and the other two methods yielded comparable estimates. However, the perturbation method may not provide reasonable uncertainty estimates for models that are not monotonically increasing or decreasing, and it does not provide confidence levels (Klavetter et al., 2012).

Several studies investigated simulation-based methods versus functional expansion-based methods. Safta et al. (2017) and Hunt et al. (2015) compared the accuracy and efficiency of MC simulations, PCE, and Quasi-Monte Carlo (QMC) simulation methods. Both studies concluded that the PCE required fewer model evaluations to converge to the true value of the output mean for the test functions. Aleti et al. (2018) studied the efficiency of MC simulation and PCE methods based on the number of sample points used to estimate the output distribution accurately. The results revealed that the PCE was 90% more efficient than MC methods in terms of the number of numerical calculations.

Jia et al. (2019) evaluated the performance of MC simulation and numerical integration approaches, including sparse grid numerical integration (SG), univariate dimension reduction (UDR), and extended sparse grid methods. They concluded that the SG methods were the most efficient in estimating the first four moments of the output, requiring the fewest model evaluations. Allen and Camberos (2009) compared simulation-based methods to response surface approaches for estimating the output probability density function and using it to calculate the probability of failure, defined as the probability that the output value exceeds a specific critical level. They employed two models as case studies, one with high nonlinearity and one with high dimension. The results were evaluated based on the number of model evaluations required to predict the desired uncertainty metrics. They concluded that response-surface methods, especially polynomial chaos expansion, accurately estimated the probability of failure with the lowest number of samples compared to other methods. Another conclusion was that accurate output distribution estimates required many samples from the uncertain input space and many model evaluations.

Some studies only considered different simulation-based methods and compared their performances. Both Burhenne et al. (2011) and Hou et al. (2019) studied MC and QMC methods by employing different sampling techniques. The performance was assessed based on accuracy and efficiency in estimating the output mean for a set of test functions in both studies and standard deviation in Hou et al. (2019). The results suggested that QMC methods are efficient and outperform MC methods in most cases.

Other studies investigated the difference in the performance of other UP method categories. Padulo et al. (2007) employed local expansion and most probable point-based approaches in their study. First- and third-order Taylor series expansion and Sigma point methods were used to estimate output uncertainty for four different test functions. Sigma point methods provided better estimates of the output mean and standard deviation for input distributions with high variance. Sigma point methods do not require derivatives, which gives them a computational advantage over Taylor series expansion for functions with expensive derivative calculations. Rajabi (2019) and Tardioli et al. (2016) investigated different response surface-based methods. Rajabi (2019) compared PCE to Gaussian Process Emulation (GPE). The study suggested that although GPE had lower normalized Root Mean Square Error (nRMSE) in estimating the response surface, PCE estimated the output mean, standard deviation, and probability density function tails with higher accuracy. In addition, PCE tended to have lower statistical dispersion with noisier input probability distributions. Tardioli et al. (2016) compared PCE, Tchebycheff expansions with sparse grids, kriging (Gaussian process modeling), and high dimensional model representation (HDMR) methods. The performance was evaluated based on the methods’ ability to represent the response surface of the test models at different sample sizes using RMSE as the metric. Tchebycheff expansion was concluded to be efficient due to its use of sparse grids, and it required a lower number of sample points to reach the desired accuracy. The performance of PCE was observed to be inconsistent, and HDMR provided results very close to those of the Tchebycheff expansion method, requiring a lower number of samples to converge to the desired accuracy for all test models. Compared to the other methods, kriging required a high number of model evaluations and had a higher RMSE for all the case studies.

Two papers compared more than two main categories of UP methods. Lee and Chen (2009) and Fahmi and Cremaschi (2016) included MC, Full Factorial Numerical Integration (FFNI), UDR, and PCE methods in a comparative analysis. Fahmi and Cremaschi (2016) also studied different sampling schemes (random sampling, Halton sequences, and Latin Hypercube sampling (LHS)) for both MC and PCE. The number of function evaluations used for the analysis was fixed in both studies. The methods were compared in terms of their ability to estimate the first four statistical moments of the model outputs. Lee and Chen (2009) concluded that the performance of the methods depended on the model characteristics, such as nonlinearity and uncertain variable interactions. The results from Fahmi and Cremaschi (2016) revealed that simulation-based methods were more sensitive to existing nonlinearities in the test functions than other methods.
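To make the cost of full factorial numerical integration concrete, the following sketch integrates a model over a tensor-product quadrature grid. It is an illustration under stated assumptions, not the FFNI implementation of the cited studies: the inputs are taken to be independent standard normal variables, a 3-point Gauss-Hermite rule is used in every dimension, and the test function g is hypothetical.

```python
import itertools
import math

# 3-point Gauss-Hermite rule for a standard normal input
# (probabilists' form): exact for polynomials up to degree 5.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def ffni_mean_variance(g, dim):
    """Mean and variance of g(X) on a full tensor-product grid.

    Assumes independent N(0, 1) inputs. The grid has
    len(NODES) ** dim points, one model evaluation per point.
    """
    mean = 0.0
    second_moment = 0.0
    n_evals = 0
    for idx in itertools.product(range(len(NODES)), repeat=dim):
        x = [NODES[i] for i in idx]
        w = math.prod(WEIGHTS[i] for i in idx)  # product weight of the grid point
        y = g(x)
        n_evals += 1
        mean += w * y
        second_moment += w * y * y
    return mean, second_moment - mean ** 2, n_evals


def g(x):
    """Hypothetical test function with an interaction term."""
    return x[0] ** 2 + x[0] * x[1]


mean, variance, n_evals = ffni_mean_variance(g, dim=2)
print(mean, variance, n_evals)
```

For this polynomial the rule is exact (mean 1, variance 3) with 9 evaluations. The same rule with ten inputs would need 3^10 = 59,049 evaluations, which is the growth that makes full factorial grids impractical as the number of uncertain inputs increases.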

The UP method comparisons in the literature demonstrate that the choice of UP method matters for propagating extrinsic uncertainty efficiently and for obtaining accurate estimates of the output uncertainty. They also allude to a correlation between the accuracy and efficiency of a UP method and the characteristics of the model to which it is applied. The goal of this paper is to establish systematic guidelines for selecting an accurate and efficient UP method considering several important factors: (1) model characteristics, (2) the number of uncertain input variables, (3) uncertainty distribution of the input variables, and (4) performance evaluation for higher-order moments like skewness and kurtosis. Each existing comparison study omits one or more of these factors, and none provides rules of thumb for selecting a UP method that will estimate the output uncertainty efficiently, i.e., with the lowest number of model evaluations. This study aims to fill this gap by thoroughly evaluating the popular methods considering all key factors. Characterizing the impact of the extrinsic uncertainties on the outputs will result in higher confidence in model predictions, which will aid the efficient and robust design and operation of the systems these models represent.

In this study, we compare seven non-intrusive UP methods based on their ability to estimate the first four statistical moments of the outputs of models with uncertain inputs. The methods considered are Monte Carlo simulation using Sobol sequences (Sobol’, 1967), Halton sequences (Halton, 1960), and LHS (McKay et al., 1979), FFNI (Duffy et al., 1998), UDR (Rahman and Xu, 2004), SG (Smolyak, 1963), and PCE (Ghanem and Spanos, 1991). An extensive set of test functions was employed to study the effects of (1) nonlinearities, (2) the number of uncertain inputs, and (3) different input uncertainty distributions for establishing guidelines for selecting efficient UP methods. The efficiency of the methods was evaluated using the minimum number of model evaluations required by each method to converge to a preset gap around the true value of the first four statistical moments. Finally, the guidelines were utilized to determine the most appropriate UP method for two case studies. Section 2 briefly explains the UP methods used in this study. Computational experiments are described in Section 3, followed by results and discussion in Section 4. Section 5 summarizes the concluding remarks and outlines future directions.

Uncertainty propagation methods

In this study, uncertainty propagation is conducted by estimating the first four statistical moments of the model output Y=g(X) (Fig. 1). The distribution of the uncertain input vector X is assumed to be known for each model (g(.)). The estimated mean, standard deviation, skewness, and kurtosis summarize the output uncertainty.
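As a rough illustration of simulation-based moment estimation, the sketch below pushes Halton points through a model and computes the four sample moments of the output. Several details are assumptions rather than the study's setup: the test function g is hypothetical, the inputs are independent standard normal variables obtained by inverse-CDF transformation, and the moments use population (biased) central moments with non-excess kurtosis (3 for a normal distribution).

```python
import math
from statistics import NormalDist


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n, bases=(2, 3)):
    """First n Halton points in (0, 1)^d, skipping index 0."""
    return [[radical_inverse(i, b) for b in bases] for i in range(1, n + 1)]


def four_moments(values):
    """Mean, standard deviation, skewness, and (non-excess) kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    return mean, std, m3 / std ** 3, m4 / m2 ** 2


def g(x):
    """Hypothetical nonlinear test function of two uncertain inputs."""
    return x[0] + x[1] ** 2


# Map uniform Halton points to independent N(0, 1) inputs (assumed distribution).
std_normal = NormalDist()
inputs = [[std_normal.inv_cdf(u) for u in point] for point in halton_points(2000)]
outputs = [g(x) for x in inputs]

mean, std, skewness, kurtosis = four_moments(outputs)
print(mean, std, skewness, kurtosis)
```

Swapping `halton_points` for another point generator (pseudo-random, Sobol, or Latin hypercube) changes only the sampling step; the model is still treated as a black box.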

Computational experiments

For computational experiments, all UP methods are implemented for propagating input uncertainty to the outputs of a set of test functions with known analytical forms. Numerous test functions and input distributions are considered in the experiments for studying the impacts of functional forms, the number of uncertain inputs, and distributions on the performance of the UP methods. The test function names, their formulas, and the input distributions are summarized in the Supplementary Materials.
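The efficiency metric used in the study, the minimum number of model evaluations needed to converge to a preset gap around the true moment value, can be sketched as follows. The exact convergence rule is not given in this excerpt, so the sketch assumes one plausible reading: the estimate must stay within the relative gap for every sample size from N up to the evaluation budget. The model x^2 with X ~ U(0, 1) (true mean 1/3), the one-dimensional Halton sampler, and all function names are illustrative assumptions.

```python
def radical_inverse(index, base=2):
    """Van der Corput radical inverse (one-dimensional Halton sequence)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def running_means(model, points):
    """Mean estimate after each successive model evaluation."""
    total, means = 0.0, []
    for count, x in enumerate(points, start=1):
        total += model(x)
        means.append(total / count)
    return means


def min_evaluations(estimates, true_value, gap=0.05):
    """Smallest N such that every estimate from N evaluations onward
    lies within the relative gap of the true value.

    estimates[k] is the estimate after k + 1 evaluations. Returns None
    if the final estimate is still outside the gap (assumed rule).
    """
    tolerance = gap * abs(true_value)
    required = None
    # Walk backwards from the full budget until the gap is first violated.
    for k in range(len(estimates) - 1, -1, -1):
        if abs(estimates[k] - true_value) > tolerance:
            break
        required = k + 1
    return required


def model(x):
    """Hypothetical test function; for X ~ U(0, 1) the true mean is 1/3."""
    return x ** 2


budget = 4096
points = [radical_inverse(i) for i in range(1, budget + 1)]
estimates = running_means(model, points)
n_required = min_evaluations(estimates, true_value=1.0 / 3.0, gap=0.05)
print("Evaluations needed for a 5% gap:", n_required)
```

Repeating this calculation for each method, moment, and test function would give the kind of per-method evaluation counts that can then be compared across methods.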

Impact of nonlinearity on the performance of uncertainty propagation methods

Figs. 2 and 3 include boxplots for the minimum number of function evaluations required to converge to a 5% error gap for each of the first four moments for the first group of nonlinearity test functions. In the graphs, P(i)-F stands for the ith order PCE where the integral was estimated using FFNI, P(i)-S using Sobol, and P(i)-H using Halton. The variables, n1, n2, n3, and n4, are the number of functions for which the method did not yield results within the 5% error gap of the true mean,

Conclusions and future directions

One important factor impacting the operation, design, and optimization of engineering processes is the uncertainties in the systems. The uncertainty propagation methods are used to quantify the uncertainty of the system output resulting from the uncertainty of inputs. This paper studied the performance of seven methods from three common groups of uncertainty propagation (UP) methods, including Monte Carlo simulation-based methods, numerical integration methods, and functional expansion-based

CRediT authorship contribution statement

Samira Mohammadi: Conceptualization, Methodology, Software, Investigation, Formal analysis, Writing – original draft, Visualization. Selen Cremaschi: Conceptualization, Methodology, Writing – review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was funded by NSF grant 1743445 and the RAPID Manufacturing Institute, USA.

References (47)

  • D.C. Miller et al. Advanced computational tools for optimization and uncertainty quantification of carbon capture processes.

  • J.P. Murcia et al. Uncertainty propagation through an aeroelastic wind turbine model using polynomial surrogates. Renew. Energy (2018)

  • S. Pattabhiraman et al. Uncertainty analysis for rolling contact fatigue failure probability of silicon nitride ball bearings. Int. J. Solids Struct. (2010)

  • S. Rahman et al. A univariate dimension-reduction method for multi-dimensional integration in stochastic mechanics. Probabilistic Eng. Mech. (2004)

  • A. Saltelli et al. Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Comput. Phys. Commun. (2010)

  • A. Sofi et al. Propagation of uncertain structural properties described by imprecise probability density functions via response surface method. Probabilistic Eng. Mech. (2020)

  • R. Tripathy et al. Gaussian processes with built-in dimensionality reduction: applications to high-dimensional uncertainty propagation. J. Comput. Phys. (2016)

  • H. Wang et al. Combustion kinetic model uncertainty quantification, propagation and minimization. Prog. Energy Combust. Sci. (2015)

  • M. Abramowitz et al. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. (1988)

  • M.S. Allen et al. Comparison of uncertainty propagation/response surface techniques for two aeroelastic systems.

  • B. Ankenman et al. Stochastic kriging for simulation metamodeling.

  • S. Burhenne et al. Sampling based on Sobol′ sequences for Monte Carlo techniques applied to building simulations.

  • J. Duffy et al. Assessing multivariate process/product yield via discrete point approximation. IIE Trans. Inst. Ind. Eng. (1998)