Accelerating Gaussian Process surrogate modeling using Compositional Kernel Learning and multi-stage sampling framework

https://doi.org/10.1016/j.asoc.2020.106909

Highlights

  • This study proposes a sequential surrogate modeling framework based on Gaussian Processes (GP).

  • The proposed method combines CKL and PLHS to accelerate surrogate modeling.

  • Compositional Kernel Learning (CKL) discovers an optimal kernel for fitting the GP.

  • Progressive Latin hypercube sampling (PLHS) augments samples with a diagnostic measure.

  • Numerical experiments show that the proposed method outperforms existing methods.

Abstract

Surrogate modeling is becoming a popular tool for approximating computationally expensive simulations in complex engineering problems. In practice, two difficulties remain: (1) efficiently learning the functional relationship of the simulation model and (2) diagnosing the accuracy of the surrogate model. To address both difficulties simultaneously, this paper proposes a new sequential surrogate modeling approach that integrates a Compositional Kernel Learning (CKL) method for Gaussian Processes into a sequential sampling strategy termed Progressive Latin Hypercube Sampling (PLHS). CKL enables efficient learning of complex response surfaces through richly structured kernels, while PLHS sequentially generates nested samples that maintain the desired distributional properties. Furthermore, this sequential sampling framework allows users to monitor diagnostics of the surrogate model and assess the stopping criteria for further sampling. To demonstrate the useful features of the proposed method, nine test functions were assembled for numerical experiments to cover different types of problems (i.e., scale and complexity). The proposed method was evaluated against a set of surrogate modeling techniques and sampling methods in terms of performance, diagnostics and computational cost. The results show that (1) the proposed method can learn various response surfaces with fewer training samples than other methods; and (2) the proposed method is the only one that provides a reliable diagnostic measure for global accuracy across different types of problems.

Introduction

Simulation models, such as Finite Element Analysis (FEA) and Computational Fluid Dynamics (CFD), are mathematical representations of real-world physical problems implemented in computer code. Nowadays, simulation models are extensively used in various types of engineering problems (e.g., domain exploration, design optimization, sensitivity/uncertainty analysis and inverse analysis) [1], [2], [3], [4], because physical experiments are either highly expensive or technically impossible.

As physical knowledge and computing power become more advanced, more sophisticated simulation models are gaining widespread use to tackle various complex engineering problems. These simulation models typically have non-linear and complex response surfaces with large input spaces [5], [6]. In addition, these simulation models often require huge computational resources (i.e., long run-time with huge computing power). If the analysis of the simulation model is iterated many times, the computational process would be highly challenging under limited resources.

To mitigate the computational burden, surrogate models have been gaining considerable attention as cost-effective substitutes for the simulation model [7], [8], [9], [10]. Since a deterministic simulation model produces identical outputs for identical inputs [3], the response surface of the simulation model can be represented by a mathematical/statistical representation [11]. This representation is referred to as a surrogate model, also known as a response surface model, emulator or meta-model. Once constructed, the surrogate model can be used for design optimization, design space exploration and sensitivity/uncertainty analysis without running additional simulations.

Based on the purpose of the engineering problem, surrogate modeling can be categorized into (1) global surrogate modeling and (2) black-box optimization. Global surrogate modeling aims at mimicking the response surface over the input space (for sensitivity or uncertainty analysis) [12], [13], [14], while black-box optimization utilizes a sequential design strategy for global optimization of black-box functions [15], [16], [17]. The scope of this study is confined to global surrogate modeling, which consists of two stages: (1) the sampling stage, wherein a set of simulation runs (known as training samples) is performed over the input space based on a sampling strategy; and (2) the model-fitting stage, wherein the surrogate model is fitted using the training samples. Among the various surrogate modeling techniques and sampling methods, selecting robust methods is still challenging for practical problems [14].

For successful global surrogate modeling, the learning capability of the surrogate model is important. There are various types of surrogate models, such as the polynomial model (POLY) [18], [19], [20], the radial basis function (RBF) [19], [20], [21], [22], [23], [24], [25] and the Gaussian Process (GP) [19], [20], [22], [23], [25], [26], [27], [28], [29], [30], [31]. Table 1 shows recent comparative studies of surrogate modeling in chronological order. Improvements in computational power have significantly increased research interest in developing more advanced surrogate models with better learning capability. As a result, non-parametric models such as the GP (also known as Kriging) and the RBF are prevalent for global surrogate modeling, since they can approximate the response surface more flexibly than parametric models (e.g., the polynomial model) [9]. Recently, enhanced GPs (such as Blind Kriging and Gradient-enhanced Kriging) have been gaining considerable attention for engineering problems [13], [24], [28], [29], [30], [31], [32], [33]. Comprehensive studies of surrogate modeling [14], [34] show that the optimal model depends on the problem type and the modeling settings (e.g., the kernel function in the GP).

Sampling methods (also known as design of experiments, DOE) generate the training samples that gather informative experiments (simulations) for surrogate modeling. The accuracy of the surrogate model heavily depends on the training samples, so the sampling method is crucial to the predictive quality of the surrogate model. Classical sampling methods from DOE (e.g., central composite design) can be used to generate training samples, but they usually place more samples around the boundary regions. For a computational DOE, it is preferable to fill the entire input space (i.e., space-filling) rather than the boundary regions [35]. In this context, space-filling sampling (SFS) methods have gained much popularity for surrogate modeling. Latin hypercube sampling (LHS) [36] and low-discrepancy sequences (e.g., Sobol's sequence) [37] are the most popular SFS methods in various fields.
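The two SFS families mentioned above can be generated with SciPy's quasi-Monte Carlo module, which provides reference implementations of both LHS and the Sobol' sequence (`scipy.stats.qmc`, SciPy ≥ 1.7); a minimal sketch:

```python
# Sketch: space-filling samples in the unit hypercube via scipy.stats.qmc.
import numpy as np
from scipy.stats import qmc

d = 2   # input dimensionality
n = 16  # number of training samples

# Latin hypercube: exactly one point per stratum in each dimension
# (the projective property discussed in the text).
lhs = qmc.LatinHypercube(d=d, seed=0)
x_lhs = lhs.random(n)                 # shape (16, 2), values in [0, 1)

# Sobol' low-discrepancy sequence (sample sizes should be powers of 2).
sobol = qmc.Sobol(d=d, seed=0)
x_sobol = sobol.random_base2(m=4)     # 2**4 = 16 points

# The discrepancy criterion quantifies how uniformly the points fill the
# cube (lower is better) and can be used to compare the two designs.
print(qmc.discrepancy(x_lhs), qmc.discrepancy(x_sobol))
```

Checking that `floor(n * x)` in each column of the LHS yields every stratum index exactly once verifies the Latin hypercube (projective) property numerically.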

The conventional SFS method is a single-stage sampling strategy that generates all samples at once. The optimal LHS has been developed to improve space-filling properties: it optimizes space-filling criteria (such as the maximin distance criterion [38], [39], orthogonal array criteria [40], [41] and others [42], [43], [44]) to generate the training samples. In the conventional SFS method, the size of the training set must be pre-determined; however, it is difficult even for experts to determine an appropriate sample size in advance. This difficulty has motivated the development of sequential SFS methods [5], [45], [46], [47], [48], [49]. Table 2 shows the development of SFS methods in chronological order. Sequential SFS methods build on conventional SFS to augment training samples. To ensure the desired space-filling property, the sequential SFS method treats the sampling process as a set of optimization problems over space-filling criteria. It is worth noting that sequential SFS methods with a nested design have recently gained considerable attention. The nested design sequentially generates successive sets of samples such that each earlier set is a subset of the later ones [5], [45], [46], [48], [49].

The accuracy of the surrogate model depends strongly on (1) the learning capability of the surrogate model and (2) the training samples, and it is often unsatisfactory for representing the response surface of the simulation model. The training samples should be sufficient to capture the response surface, while the learning capability of the surrogate model should be maximized to learn the response surface effectively. In general, these two factors interact with each other and jointly influence the accuracy of the surrogate model. For example, a large training set is required to reach reasonable accuracy when the learning capability is inefficient. In this context, it is important to validate the accuracy of the surrogate model before its implementation. However, there is little research on diagnostics for surrogate models [3], [13], [51].

To address these issues simultaneously, this paper proposes a new surrogate modeling approach based on the GP, incorporating a Compositional Kernel Learning (CKL) method [52], [53], [54] into a sequential SFS strategy termed Progressive Latin Hypercube Sampling (PLHS) [5]. The CKL was developed by Duvenaud et al. [52] in the machine learning community. The covariance kernels of the GP are known to be closed under compositional rules (i.e., sum and product) [52]. Thus, the CKL automatically discovers a compositional, richly structured kernel to represent complex properties of the function. Although the CKL is outstanding at learning both simple and complex functions, it is still relatively new to surrogate modeling. To diagnose the GP with an appropriate training-sample size, the proposed method introduces the PLHS. The PLHS successively generates a series of sub-samples (i.e., smaller slices) while maintaining the desired distributional properties (space-filling and projective properties). Sheikholeslami et al. [5] demonstrated that the PLHS shows outstanding performance that scales effectively with the dimensionality of the problem. Each sub-sample in the PLHS is itself a Latin hypercube, as shown in Sheikholeslami et al. [5], so the sub-samples preserve projective properties (i.e., the Latin hypercube structure) along with space-filling properties (i.e., the maximin distance criterion). For the diagnostics of the surrogate model, the proposed method uses two consecutive sub-samples in the PLHS as training and validation samples, respectively. By virtue of the nested samples in the PLHS, the proposed method allows users to monitor the diagnostics of the GP and assess the stopping criteria for further sampling.
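The closure of GP kernels under sum and product can be illustrated with scikit-learn's kernel algebra, where `+` and `*` on kernel objects build composite kernels directly. This is only a sketch of the compositional idea; the greedy search over candidate compositions in Duvenaud et al.'s CKL is not implemented here, but the log marginal likelihood printed at the end is the kind of model-selection score such a search would compare:

```python
# Sketch: building a composite kernel (RBF x Periodic + Linear) and fitting
# a GP, using scikit-learn's kernel algebra. The choice of base kernels and
# toy data is illustrative, not from the paper.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, DotProduct

# Base kernels: smooth (RBF), periodic (ExpSineSquared), linear (DotProduct).
base = RBF(length_scale=1.0)
periodic = ExpSineSquared(length_scale=1.0, periodicity=1.0)
linear = DotProduct()

# Compositions: sums model additive structure, products model interactions.
composite = base * periodic + linear

# Toy 1-D data: a periodic signal on a linear trend.
rng = np.random.default_rng(0)
X = rng.uniform(0, 4, size=(40, 1))
y = 0.5 * X.ravel() + np.sin(2 * np.pi * X.ravel())

gp = GaussianProcessRegressor(kernel=composite, normalize_y=True).fit(X, y)
# A CKL-style greedy search would compare this score across compositions.
print(gp.log_marginal_likelihood_value_)
```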
Numerical experiments reveal that (1) the proposed method generally outperforms, or performs comparably to, the best of a set of surrogate models, so it can learn various types of response surfaces (i.e., scale and complexity) flexibly and efficiently; and (2) only the proposed method provides robust correlations between accuracies computed from validation samples (generated by the PLHS) and test samples (not available in real applications). These results indicate that only the proposed method can ensure a diagnostic measure for global surrogate modeling via the proposed framework.

The remainder of this study is organized as follows. Section 2 first introduces the Gaussian process model with the CKL. Then, the PLHS is presented for diagnosing the GP and finding an appropriate training-sample size. In Section 3, the proposed surrogate modeling method is introduced. Section 4 describes the numerical experiments, and their results are provided in Section 5. The proposed method is discussed in Section 6. Lastly, Section 7 summarizes the conclusions. Hereafter, boldface letters indicate vectors or matrices.

Section snippets

Gaussian Processes

A Gaussian Process (GP) is a Bayesian non-parametric model that provides an analytically tractable way of learning a complex function from input to output [26]. A GP is a distribution over functions such that any finite set of function values has a joint multivariate Gaussian distribution. In this context, a GP is completely defined by a mean function m(X) and a covariance kernel K(X, X | ψ). The response surface η(X) is assumed to be a finite set of function values Y(X) with inputs X ∈ R^p:
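The defining property above, that any finite set of function values is jointly multivariate Gaussian, can be checked numerically: pick a finite input set X, build the covariance matrix K(X, X), and draw function values from N(0, K). A minimal sketch with a squared-exponential kernel (a common default choice, not a prescription from the paper):

```python
# Sketch: drawing finite-dimensional function values from a zero-mean GP
# prior with a squared-exponential covariance kernel.
import numpy as np

def sq_exp_kernel(X1, X2, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance between two input sets."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

rng = np.random.default_rng(0)
X = np.linspace(0, 5, 50)[:, None]   # 50 one-dimensional inputs
K = sq_exp_kernel(X, X)              # 50 x 50 covariance matrix K(X, X)

# Three draws from the GP prior: each draw is one multivariate-normal
# sample (diagonal jitter keeps the factorization numerically stable).
samples = rng.multivariate_normal(np.zeros(50), K + 1e-8 * np.eye(50), size=3)
print(samples.shape)   # (3, 50)
```

Each row of `samples` is one realization of the function evaluated at all 50 inputs, which is exactly the "finite set of function values" in the definition.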

Proposed method using CKL and PLHS for surrogate modeling

Although the GP has been widely used for surrogate modeling due to its advantages (e.g., prediction uncertainty), three difficulties remain: (1) choosing a proper covariance kernel; (2) determining an appropriate training-sample size; and (3) diagnosing accuracy. To address these difficulties simultaneously, this study proposes an efficient GP-based surrogate modeling method that integrates the CKL with the PLHS. Fig. 6 shows the flowchart of the proposed method.
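The overall loop of the framework described above can be sketched in a few lines: PLHS supplies nested sample slices, the CKL refits the GP on the accumulated training set, and each new slice first serves as a validation set for the stopping criterion before being merged into the training set. The helpers `run_simulation`, `plhs_slice` and `ckl_fit_gp` are hypothetical placeholder names, not the paper's implementation:

```python
# Hypothetical skeleton of the sequential CKL + PLHS loop. All three callables
# are placeholders: `plhs_slice(k)` returns the k-th nested PLHS slice,
# `run_simulation` evaluates the expensive model, and `ckl_fit_gp` fits a GP
# with a CKL-selected kernel and exposes a `.predict` method.
import numpy as np

def sequential_surrogate(run_simulation, plhs_slice, ckl_fit_gp,
                         n_slices=5, tol=0.05):
    X_train, y_train, gp = None, None, None
    for k in range(n_slices):
        X_new = plhs_slice(k)                  # next nested PLHS slice
        y_new = run_simulation(X_new)          # expensive simulation runs
        if gp is not None:
            # Diagnostic: the new slice validates the current surrogate.
            rmse = np.sqrt(np.mean((gp.predict(X_new) - y_new) ** 2))
            if rmse < tol:                     # stopping criterion met
                return gp
        # Merge the slice into the training set and refit via CKL.
        X_train = X_new if X_train is None else np.vstack([X_train, X_new])
        y_train = y_new if y_train is None else np.concatenate([y_train, y_new])
        gp = ckl_fit_gp(X_train, y_train)
    return gp
```

Because the PLHS slices are nested, every validation slice later becomes training data, so no simulation run is wasted on validation alone.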

Test functions and their characteristics

To compare the proposed method with other methods, nine test functions were selected from the literature to cover a range of dimensionality and complexity. The mathematical representations of the test functions are summarized in Appendix B. These test functions are well-known benchmark problems in surrogate modeling and optimization. Table 3 summarizes the characteristics of the test functions. In terms of dimensionality (d), the test functions can be categorized into three levels:

Results and analysis

In order to account for sampling variability, different random seeds were used to generate ten different replicates of training samples. Based on the ten replicates, numerical experiments were performed to evaluate the robustness against the random components in the proposed methods. The proper size of the training samples depends on the types of the problems and computational budgets. Since there is no optimal way to determine the size of the training samples, the empirical formula (Eq. (18))

Discussion on proposed method

This section discusses two issues related to the proposed method: (1) computational complexity and (2) limitations with respect to discontinuous response surfaces.

– Computational complexity

The computational complexity of fitting the GP grows cubically with the sample size, O(n³). Although the CKL provides superior learning capability, as seen in Section 5.1, the iterative fitting in the CKL aggravates the computational cost. Therefore, the proposed method is
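A back-of-the-envelope consequence of the O(n³) scaling, which dominates GP fitting through the factorization of the n × n covariance matrix, is that doubling the sample size multiplies the fitting cost by roughly 2³ = 8:

```python
# Sketch: relative O(n^3) fitting cost for two sample sizes. This is a rough
# scaling estimate, ignoring constants and lower-order terms.
def relative_gp_cost(n_old, n_new):
    """Ratio of O(n^3) GP fitting costs when the sample size changes."""
    return (n_new / n_old) ** 3

print(relative_gp_cost(100, 200))   # 8.0
print(relative_gp_cost(100, 1000))  # 1000.0
```

This is why the iterative refitting in the CKL, which repeats the O(n³) fit for every candidate kernel composition at every sampling stage, dominates the method's overhead as the training set grows.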

Conclusions

This study proposed a sequential surrogate modeling method using Compositional Kernel Learning (CKL) with Progressive Latin Hypercube Sampling (PLHS). The proposed method improves learning capability for response surfaces while providing a diagnostic measure for the global accuracy of the Gaussian Process (GP). The proposed method employs the CKL to automatically discover a proper covariance kernel from the observed samples. Until the desired accuracy of the GP is achieved, the PLHS is implemented

CRediT authorship contribution statement

Seung-Seop Jin: Conceptualization, Methodology, Software, Writing, Revision, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1C1C1009236).

References (73)

  • Zhou, Y., An enhanced Kriging surrogate modeling technique for high-dimensional problems, Mech. Syst. Signal Process. (2020)

  • Østergård, T., A comparison of six metamodeling techniques applied to building performance simulations, Appl. Energy (2018)

  • Johnson, M.E., Minimax and maximin distance designs, J. Statist. Plann. Inference (1990)

  • Loeppky, J.L., Projection array based designs for computer experiments, J. Statist. Plann. Inference (2012)

  • Ye, K.Q., Algorithmic construction of optimal symmetric Latin hypercube designs, J. Statist. Plann. Inference (2000)

  • Wu, Z., Efficient space-filling and near-orthogonality sequential Latin hypercube for computer experiments, Comput. Methods Appl. Math. (2017)

  • Ward, E.J., A review and comparison of four commonly used Bayesian and maximum likelihood model selection tools, Ecol. Model. (2008)

  • Razavi, S., Numerical assessment of metamodelling strategies in computationally intensive optimization, Environ. Model. Softw. (2012)

  • Broad, D.R., A systematic approach to determining metamodel scope for risk-based optimization and its application to water distribution system design, Environ. Model. Softw. (2015)

  • Rakshit, P., Realization of learning induced self-adaptive sampling in noisy optimization, Appl. Soft Comput. (2018)

  • Redouane, K., Adaptive surrogate modeling with evolutionary algorithm for well placement optimization in fractured reservoirs, Appl. Soft Comput. (2019)

  • Kennedy, M.C., Bayesian calibration of computer models, J. R. Stat. Soc. Ser. B Stat. Methodol. (2001)

  • Bastos, L.S., Diagnostics for Gaussian process emulators, Technometrics (2009)

  • Oberkampf, W.L., Verification and Validation in Scientific Computing (2010)

  • Liu, H., A survey of adaptive sampling for global metamodeling in support of simulation-based complex engineering design, Struct. Multidiscip. Optim. (2018)

  • Morris, M.D., Bayesian design and analysis of computer experiments: Use of derivatives in surface prediction, Technometrics (1993)

  • Jones, D.R., A taxonomy of global optimization methods based on response surfaces, J. Global Optim. (2001)

  • Canas, L.S., Gaussian Processes with Optimal Kernel Construction for Neuro-Degenerative Clinical Onset Prediction (2018)

  • Kajbaf, A.A., Application of surrogate models in estimation of storm surge: A comparative assessment, Appl. Soft Comput. (2020)

  • Kianifar, M.R., Performance evaluation of metamodelling methods for engineering problems: towards a practitioner guide, Struct. Multidiscip. Optim. (2020)

  • Jones, D.R., Efficient global optimization of expensive black-box functions, J. Global Optim. (1998)

  • Hernández-Lobato, J.M., Predictive entropy search for efficient global optimization of black-box functions

  • Müller, J., Surrogate optimization of computationally expensive black-box problems with hidden constraints, INFORMS J. Comput. (2019)

  • Box, G.E.P., Empirical Model-Building and Response Surfaces (1986)

  • Zhao, L., Metamodeling method using dynamic kriging for design optimization, AIAA J. (2011)

  • Buhmann, M.D., Radial Basis Functions: Theory and Implementations (2003)