Using a Gaussian process regression inspired method to measure agreement between the experiment and CFD simulations
Introduction
Understanding complex flow phenomena is an important task in many fields, e.g., performance of electric gadgets (Conficoni et al., 2015; Kaplan et al., 2013), fluid-structure interaction (Glück et al., 2001; Hou et al., 2012), thermal hydraulics in complex systems (Amin et al., 2018; Duan et al., 2014; Duan and He, 2017a, 2017b), as well as dispersion of air pollutants (Al-Abidi et al., 2013; Allegrini et al., 2014; Gilani et al., 2016; Gromke et al., 2015). Thanks to the rapid development of commercial computational fluid dynamics (CFD) software in conjunction with growing computing power, the replacement of full-scale experiments with sophisticated CFD models is a growing trend. However, high-fidelity CFD methods, such as DNS and LES, are still impractical in most industrial cases. Currently, the most widely used CFD methods, such as RANS and hybrid LES/RANS, are subject to various levels of simplification and assumption, which introduce model bias into the predictions. Therefore, identifying the suitability of a CFD model for specific scenarios is an ongoing endeavour.
Graphical comparison is a common approach for assessing the performance of CFD models and for validating and calibrating turbulence models, for instance in (Billard et al., 2012; Duan et al., 2019, 2018a, 2018b; Keshmiri et al., 2016b, 2016a; Launder and Spalding, 1974; Menter, 2009; Revell et al., 2006; Shih et al., 1995). The approach qualitatively determines the agreement between the model and experiments by observing the data plotted in figures. By its nature, the conclusion of a graphical comparison is observer-dependent and may be biased, especially when the results of different models are close to each other. Therefore, quantitative assessment of the agreement between CFD models and measurements is important for reaching a more objective conclusion when ranking and validating models. The widely accepted goal of validation is the determination of the degree to which a model is an accurate representation of the physics from the perspective of the intended uses of the model (AIAA, 1998; ASME, 2006; Oberkampf and Trucano, 2008; Oberkampf et al., 2002).
Reviews of previously developed validation methods can be found in Lee et al. (2016) and Ling and Mahadevan (2013). The methods for assessing the validity of a numerical model can be classified into two types, namely hypothesis tests and metric assessments. Hypothesis tests aim to accept or reject the model, whilst metric assessments focus on quantifying the agreement between the model and experiment. Hypothesis tests, such as the p-value test and the Bayes factor, were developed by considering the statistics of either the experiment or the numerical simulation. Meanwhile, many metric assessment methods have been developed and assessed, such as Euclidean distance (Audouin et al., 2011; Peacock et al., 1999), Mahalanobis distance (Rebba and Mahadevan, 2006; Zhao et al., 2017), the confidence interval (Barone et al., 2006; Oberkampf and Barone, 2006), and area metrics (Ferson et al., 2008; Ferson and Oberkampf, 2009). Both Euclidean distance and Mahalanobis distance measure the difference between two vectors. The former treats the squared error of each element in the vector equally, while the latter accounts for the statistical features of the elements. The confidence interval is designed to provide the confidence level of the model error. Area metrics assess the similarity of the cumulative distributions of the stochastic processes obtained from experiments and simulations.
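For reference, the two vector distances can be written as follows. The notation is an assumption introduced here for illustration: x and y denote the two vectors being compared, and Σ denotes the covariance matrix of their elements.

```latex
d_E(\mathbf{x}, \mathbf{y}) = \sqrt{(\mathbf{x} - \mathbf{y})^{\mathsf{T}} (\mathbf{x} - \mathbf{y})},
\qquad
d_M(\mathbf{x}, \mathbf{y}) = \sqrt{(\mathbf{x} - \mathbf{y})^{\mathsf{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \mathbf{y})}
```

With Σ equal to the identity matrix the two distances coincide. With a diagonal Σ, each squared difference is divided by the variance of the corresponding element, so elements with larger scatter contribute less.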
Additionally, a high-fidelity database is required to validate a numerical method or assess the capability of a physical model. Building a new test rig or running a high-fidelity simulation, such as a direct numerical simulation (DNS), may not be practical, so utilizing historical experimental databases is a more feasible and economical choice. In most cases, however, the measurements cannot be used directly in a quantitative validation because they are misaligned with the numerical outputs in the spatial or parameter domain. Barone et al. (2006) and Oberkampf and Barone (2006) suggested first fitting a non-linear regression model to the measurements and then using the fitted curve in place of the experimental measurements. Accordingly, the confidence interval on the model error, as well as the global agreement metric, can be calculated even when the measurements are sparse over the range of the input parameters. One of the major limitations of this approach comes from the regression method: the functional form chosen for the non-linear regression is known to have a large impact on the results. Based on the Bayesian calibration procedure developed by Kennedy and O'Hagan (2001), Wang et al. (2009) suggested a framework using Gaussian process regression (GPR) to fit the curve as well as the confidence interval for the model error. The procedure also provides the mean and confidence interval on the model bias (error) over the observation domain. As a further development of Wang et al. (2009), Chen et al. (2008) developed a metric based on the p-value test designed to determine the accuracy of a model for design purposes. These existing methods are not suited to ranking the performance of different numerical models.
This work focuses on quantitatively assessing the difference between experiment and simulations, as well as ranking the numerical models used. It consists of two steps: curve-fitting of the observations using GPR and evaluation of the distance between the fitted curve and numerical outputs using metrics. The statistically weighted squared error is used to represent the local distance, whilst the standardised Euclidean distance is used to provide the overall distance between the model outputs and the fitted curve. The numerical models will be ranked in terms of performance by the comparison of their standardised Euclidean distances.
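The two-step procedure can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation used in this work: the squared-exponential kernel, the fixed hyperparameters, the synthetic profile, and the names `gpr_predict`, `model_a` and `model_b` are all hypothetical, and the weighted squared error is assumed here to be the squared difference divided by the GPR predictive variance.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale, signal_var):
    """Squared-exponential covariance between two 1-D arrays of locations."""
    diff = xa[:, None] - xb[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def gpr_predict(x_obs, y_obs, x_new, length_scale=0.5, signal_var=1.0, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k_oo = rbf_kernel(x_obs, x_obs, length_scale, signal_var)
    k_oo = k_oo + noise_var * np.eye(x_obs.size)
    k_no = rbf_kernel(x_new, x_obs, length_scale, signal_var)
    chol = np.linalg.cholesky(k_oo)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    mean = k_no @ alpha
    v = np.linalg.solve(chol, k_no.T)
    # Predictive variance of a noisy observation (assumed choice: noise included).
    var = signal_var + noise_var - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def weighted_squared_error(model_out, mean, std):
    """Local distance: squared error scaled by the predictive variance (assumed form)."""
    return ((model_out - mean) / std) ** 2


def standardised_euclidean_distance(model_out, mean, std):
    """Overall distance: square root of the summed weighted squared errors."""
    return float(np.sqrt(np.sum(weighted_squared_error(model_out, mean, std))))


# Synthetic "measurements" of a profile (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x_obs = np.linspace(-1.0, 1.0, 15)
y_obs = np.cos(np.pi * x_obs) + rng.normal(0.0, 0.05, x_obs.size)

# Step 1: fit the curve and evaluate it where the numerical outputs are available,
# which need not coincide with the measurement locations.
x_cfd = np.linspace(-0.9, 0.9, 25)
mean, std = gpr_predict(x_obs, y_obs, x_cfd, noise_var=0.05 ** 2)

# Step 2: measure the distance of each (hypothetical) model output from the fitted curve.
outputs = {
    "model_a": np.cos(np.pi * x_cfd) + 0.05,
    "model_b": np.cos(np.pi * x_cfd) + 0.30,
}
distances = {
    name: standardised_euclidean_distance(out, mean, std)
    for name, out in outputs.items()
}

# A smaller distance indicates closer agreement with the fitted curve.
ranking = sorted(distances, key=distances.get)
print(ranking, distances)
```

In practice the kernel hyperparameters would be estimated from the data rather than fixed, and a library implementation of GPR could replace the hand-written solver.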
In current practice, CFD models are treated as deterministic. However, a numerical model is affected by many uncertainties arising from inadequate knowledge of the physical system, e.g., variability in physical properties (Rebba and Mahadevan, 2008). Exploring the stochastic features of a variable using a numerical method requires hundreds or even thousands of simulations. CFD simulations, however, are often computationally expensive, especially scale-resolved simulations. As a result, it is often too expensive to recreate the probability distribution of a variable purely from CFD simulations. It is possible to simulate the stochastic features using a surrogate model that mimics the CFD model in the event domain. This will be pursued in our future work.
The rest of this paper is organised as follows: a brief introduction to the GPR method is given in Section 2. The definitions of the statistically weighted squared error and the standardised Euclidean distance are given in Section 3. Section 4 describes the experiment by Volvo (Sjunnesson et al., 1992) and the CFD models. GPR predictions are validated in Section 5 before the quantification procedure is demonstrated in Section 6. General conclusions are given in Section 7.
Section snippets
Gaussian process regression
The GPR, also known as kriging (Krige, 1951), provides the interpolation of the unknown based on prior knowledge. Compared to parametric regression methods, such as least-squares linear regression and polynomial regression, GPR is a more rigorous method for the treatment of complex noisy non-linear functions (Chilenski et al., 2015). Instead of the use of a prescribed functional form for the regression function, GPR uses the prior information (the known data) to estimate the posterior (unknown
Distance between two datasets
The Euclidean distance is sufficient for describing the difference between two deterministic datasets. In the calculation of Euclidean distance each data point contributes equally to the value. However, experimental observations are generally subject to random fluctuations of different magnitudes. As a result, the Euclidean distance is not suitable for assessing the agreement between experimental observations and numerical outputs. It is reasonable to weight the coordinates subject to greater
Descriptions of the demonstration case
The isothermal, non-reacting prism bluff-body flow measured by Volvo (Sjunnesson et al., 1992, 1991) is simulated using different scale-resolving CFD methods. The experimental facility, the flow conditions and the setup of the CFD models are described in this section.
LOO-CV of the GPR model
Before training the GPR models, it is useful to explore the shape of the measured profiles. Profiles of U/Ub, urms/Ub and vrms/Ub in the y-direction should be symmetric about y/a = 0.0. Hence, it is reasonable to reflect observations of U/Ub, urms/Ub and vrms/Ub about y/a = 0.0. Moreover, the measurements of are rotated 180˚ around the point (, y/a = 0.0). The rotated measurements are then added to the original dataset. In this way, more data points are obtained for the
Quantification of the agreement between the experiment and CFD models
The agreement between the experiment (Sjunnesson et al., 1992, 1991) and the simulations is examined in this section using the quantities at the previously defined locations, as shown in Fig. 2(b). The difference between the experimental measurements and numerical outputs is quantified by the distance between the GPR predictions and CFD outputs. The profiles of the quantities of interest (QoIs) obtained by various turbulence models, as well as the statistically weighted squared error (
Summary and conclusions
A GPR-based method for measuring the agreement between outputs of CFD simulations and experimental measurements is proposed in this paper. The GPR model, trained and validated using the measurements, provides pseudo-experimental measurements at positions in the simulation where no direct experimental measurements exist. The differences between the numerical models and experiment are mimicked by comparing the numerical output to the predictions of validated GPR models. Quantified information is
Declaration of Competing Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgement
The authors would like to thank Rolls-Royce for the financial support via the project grant P65165_MECM.
Reference (63)
- et al. CFD applications for latent heat thermal energy storage: a review. Renew. Sustain. Energy Rev. (2013)
- et al. Buoyant flows in street canyons: validation of CFD simulations with wind tunnel measurements. Build. Environ. (2014)
- et al. Quantifying differences between computational results and measurements in the case of a large-scale well-confined fire scenario. Nucl. Eng. Des. (2011)
- et al. Application of recently developed elliptic blending based models to separated flows. Int. J. Heat Fluid Flow (2012)
- et al. An assessment of eddy viscosity models on predicting performance parameters of valves. Nucl. Eng. Des. (2019)
- et al. Model validation and predictive capability for the thermal challenge problem. Comput. Methods Appl. Mech. Eng. (2008)
- A computational study of combustion instabilities due to vortex shedding. Proc. Combust. Inst. (2000)
- et al. CFD simulation of stratified indoor environment in displacement ventilation: validation and sensitivity analysis. Build. Environ. (2016)
- et al. Computation of fluid–structure interaction on lightweight structures. J. Wind Eng. Ind. Aerodyn. (2001)
- et al. CFD analysis of transpirational cooling by vegetation: case study for specific meteorological conditions during a heat wave in Arnhem, Netherlands. Build. Environ. (2015)
- The numerical computation of turbulent flows. Comput. Methods Appl. Mech. Eng.
- Quantitative model validation techniques: new insights. Reliab. Eng. Syst. Saf.
- Remarks on multi-output Gaussian process regression. Knowl. Based Syst.
- Measures of agreement between computation and experiment: validation metrics. J. Comput. Phys.
- Verification and validation benchmarks. Nucl. Eng. Des.
- Computational methods for model reliability assessment. Reliab. Eng. Syst. Saf.
- Validation of models with multivariate output. Reliab. Eng. Syst. Saf.
- A stress strain lag eddy viscosity model for unsteady mean flow. Int. J. Heat Fluid Flow
- A new k-epsilon eddy viscosity model for high Reynolds number turbulence flows. Comput. Fluids
- Validation metric based on Mahalanobis distance for models with multiple correlated responses. Reliab. Eng. Syst. Saf.
- Guide for the Verification and Validation of Computational Fluid Dynamics Simulations
- Large eddy simulation study on forced convection heat transfer to water at supercritical pressure in a trapezoid annulus. J. Nucl. Eng. Radiat. Sci.
- Guide for verification and validation in computational solid mechanics. Am. Soc. Mech. Eng. PTC
- Validation case study: prediction of compressible turbulent mixing layer growth rate. AIAA J.
- A design-driven validation approach using Bayesian prediction models. J. Mech. Des.
- Improved profile fitting and quantification of uncertainty in experimental measurements of impurity transport coefficients using Gaussian process regression. Nucl. Fusion
- Energy-aware cooling for hot-water cooled supercomputers
- A validation of CFD methods on predicting valve performance parameters
- Assessments of different turbulence models in predicting the performance of a butterfly valve
- Large eddy simulation of a buoyancy-aided flow in a non-uniform channel – Buoyancy effects on large flow structures. Nucl. Eng. Des.
- Heat transfer of a buoyancy-aided turbulent flow in a trapezoidal annulus. Int. J. Heat Mass Transf.