
Gaussian mixture model fitting method for uncertainty quantification by conditioning to production data

  • Original Paper
  • Published in: Computational Geosciences

Abstract

For most history matching problems, the posterior probability density function (PDF) may have multiple local maxima, and it is extremely challenging to quantify the uncertainty of model parameters and production forecasts by conditioning to production data. In this paper, a novel method is proposed to improve the accuracy of the Gaussian mixture model (GMM) approximation of the complex posterior PDF by adding more Gaussian components. Simulation results of all reservoir models generated during the history matching process, e.g., using the distributed Gauss-Newton (DGN) optimizer, are used as training data points for GMM fitting. The distance between the GMM approximation and the actual posterior PDF is estimated by summing the errors calculated at all training data points. The distance is an analytical function of the unknown GMM parameters, such as the covariance matrix and weighting factor of each Gaussian component, and these parameters are determined by minimizing the distance function. A GMM is accepted if the distance is reasonably small; otherwise, new Gaussian components are added iteratively to further reduce the distance until convergence. Finally, high-quality conditional realizations are generated by sampling from each Gaussian component in the mixture with the appropriate relative probability. The proposed method is first validated using nonlinear toy problems and then applied to a history matching example. GMM generates better samples at a computational cost comparable to or lower than that of the other methods we tested. The GMM samples yield production forecasts that match production data reasonably well in the history matching period and are consistent with production data observed in the blind test period.


References

  1. Aanonsen, S.I., et al.: The ensemble Kalman filter in reservoir engineering—a review. SPE J. 14(3), 393–412 (2009)

  2. Alabert, F.: The practice of fast conditional simulations through the LU decomposition of the covariance matrix. Math. Geol. 19(5), 369–386 (1987)

  3. Araujo, M., et al.: Benchmarking of advanced methods for assisted history matching and uncertainty quantification. SPE-193910-MS to be presented at the SPE Reservoir Simulation Conference held in Galveston, Texas, USA, 10-11 April (2019)

  4. Bilmes, J.A.: Gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models, Technical Report ICSI-TR-97-02, University of Berkeley (1997)

  5. Chen, C., et al.: Global search distributed-Gauss-Newton optimization method and its integration with the randomized-maximum-likelihood method for uncertainty quantification of reservoir performance. SPE J. 23(5), 1496–1517 (2018). https://doi.org/10.2118/182639-PA

  6. Chen, Y., Oliver, D.: Ensemble randomized maximum likelihood method as an iterative ensemble smoother. Math. Geosci. 44(1), 1–26 (2012)

  7. Chen, Y., Oliver, D.: Levenberg-Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput. Geosci. 17(4), 689–703 (2013)

  8. Figueiredo, M.: On Gaussian radial basis function approximations: interpretation, extensions, and learning strategies, Proceedings 15th International Conference on Pattern Recognition held in Barcelona, Spain, 3-7 September (2000)

  9. Figueiredo, M., Leitao, J., Jain, A.K.: On fitting mixture models. In: Hancock, E., Pellilo, M. (eds.) Energy Minimization Methods in Computer Vision and Pattern Recognition, pp. 54–69. Springer (1999)

  10. Chu, L., Reynolds, A.C., Oliver, D.: Computation of sensitivity coefficients for conditioning the permeability field to well-test data. In Situ 19(2), 179–223 (1995)

  11. Davis, M.: Production of conditional simulations via the LU decomposition of the covariance matrix. Math. Geol. 19(2), 91–98 (1987)

  12. Ehrendorfer, M.: A review of issues in ensemble-based Kalman filtering. Meteorol. Z. 16(6), 795–818 (2007)

  13. Elsheikh, A.H., Wheeler, M.F., Hoteit, I.: Clustered iterative stochastic ensemble method for multi-modal calibration of subsurface flow models. J. Hydrol. 491, 40–55 (2013)

  14. Emerick, A.A., Reynolds, A.: Ensemble smoother with multiple data assimilation. Comput. Geosci. 55, 3–15 (2013)

  15. Evensen, G.: Data assimilation: the ensemble Kalman filter. Springer, New York (2007)

  16. Gao, G., et al.: Robust uncertainty quantification through integration of distributed Gauss-Newton optimization with Gaussian Mixture Model and Parallelized Sampling Algorithms. Paper SPE-191516-MS presented at the SPE Annual Technical Conference and Exhibition, Dallas, Texas, USA, 24-26 September (2018)

  17. Gao, G., et al.: Uncertainty quantification for history matching problems with multiple best matches using a distributed Gauss-Newton method. Paper SPE-181611-MS presented at the SPE Annual Technical Conference and Exhibition, Dubai, UAE, 26–28 September (2016)

  18. Gao, G., et al.: Distributed Gauss-Newton optimization method for history matching problems with multiple best matches. Comput. Geosci. 21(5-6), 1325–1342 (2017)

  19. Gao, G., et al.: A Gauss-Newton trust region solver for large scale history matching problems. SPE J. 22 (6), 1999–2011 (2017)

  20. Guo, Z., et al.: Integration of support vector regression with distributed Gauss-Newton optimization method and its applications to the uncertainty assessment of unconventional assets. SPE Reservoir Evaluation & Engineering, 21(4) (2018)

  21. Guo, Z., et al.: Enhancing the performance of the distributed Gauss-Newton optimization method by reducing the effect of numerical noise and truncation error with support-vector regression. SPE J. 23(6), 2428–2443 (2018)

  22. Grana, D., Fjeldstad, T., Omer, H.: Bayesian Gaussian mixture linear inversion in geophysical inverse problems. Math. Geosci. 49(4), 493–515 (2017). https://doi.org/10.1007/s11004-016-9671-9

  23. Kitanidis, P.: Quasi-linear geostatistical theory for inversing. Water Resour. Res. 31(10), 2411–2419 (1995)

  24. Liu, N., Oliver, D.: Evaluation of Monte Carlo methods for assessing uncertainty. SPE J. 8(2), 188–195 (2003)

  25. McLachlan, G.: Finite mixture models. Wiley, New York (2000)

  26. Meyn, S.P., Tweedie, R.L.: Markov chains and stochastic stability. Springer, London (1993)

  27. Muthen, B., Shedden, K.: Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics 55, 463–469 (1999)

  28. Oliver, D.S.: Multiple realization of the permeability field from well-test data. SPE J. 1(2), 145–155 (1996)

  29. Oliver, D.: Metropolized randomized maximum likelihood for improved sampling from multimodal distributions. SIAM/ASA J. Uncertainty Quantification 5(1), 259–277 (2017)

  30. Oliver, D.S., Chen, Y.: Recent progress on reservoir history matching: a review. Comput. Geosci. 15(1), 185–211 (2011)

  31. Oliver, D.S., Reynolds, A.C., Liu, N.: Inverse theory for petroleum reservoir characterization and history matching. Cambridge University Press, Cambridge (2008)

  32. Oliver, D.S., Alfonzo, M.: Calibration of imperfect models to biased observations. Comput. Geosci. 22(1), 145–161 (2018)

  33. Rafiee, J., Reynolds, A.C.: A two-level MCMC based on the distributed Gauss-Newton method for uncertainty quantification. The 16th European Conference on the Mathematics of Oil Recovery, Barcelona, Spain, 3-6 September (2018)

  34. Sondergaard, T., Lermusiaux, P.F.: Data assimilation with Gaussian mixture models using the dynamically orthogonal field equations. Part I: Theory and scheme. Monthly Weather Review. 141(6), 1737–1760 (2013)

  35. Sondergaard, T., Lermusiaux, P.F.: Data assimilation with Gaussian mixture models using the dynamically orthogonal field equations. Part II: applications. Monthly Weather Review. 141(6), 1737–1760 (2013)

  36. Stordal, A.: Iterative Bayesian inversion with Gaussian mixtures: finite sample implementation and large sample asymptotics. Comput. Geosci. 19(1), 1–15 (2015)

  37. Sun, W., Vink, J.C., Gao, G.: A practical method to mitigate spurious uncertainty reduction in history matching workflows with imperfect reservoir model. Paper SPE-182599-MS presented at the SPE Reservoir Simulation Conference held in Montgomery, TX, USA, 20–22 February (2017)

  38. Sung, H.: Gaussian mixture regression and classification. Ph.D. thesis, Rice University, Houston, Texas, USA (2004)

  39. Tarantola, A.: Inverse problem theory and methods for model parameter estimation. SIAM (2005)

  40. Yu, G., Sapiro, G., Mallat, S.: Solving inverse problems with piecewise linear estimators: from Gaussian mixture models to structured sparsity. IEEE Trans. Image Process. 21(5), 2481–2499 (2012)


Author information

Corresponding author

Correspondence to Guohua Gao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Hao Jiang now works for Memorial Hermann Health System.

Appendices

Appendix A: Finite mixture models

A finite mixture model [25] with K components, \(p_{\mathrm{MM}}(x\mid{\Theta})\), is defined as

$$ p_{\mathrm{MM}}\left( x \mid {\Theta} \right)=\sum\nolimits_{k=1}^{K} f^{(k)}p^{(k)}\left( x,\theta^{(k)}\right). $$
(1)

In Eq. 1, \(x\) is an \(n\)-dimensional random vector, \(p^{(k)}(x,\theta^{(k)})\) is the probability density function (PDF) with parameters \(\theta^{(k)}\) for the \(k\)th mixture component with \(1\le k\le K\), and \(f^{(k)}\) is the positive-valued mixture weight of the \(k\)th mixture component, representing the probability of generating a sample from this component. By definition, \(\sum_{k=1}^{K} f^{(k)}=1\). The complete set of parameters defining a mixture model with K components is \({\Theta}=\left\{ f^{(1)},f^{(2)},\ldots,f^{(K)};\theta^{(1)},\theta^{(2)},\ldots,\theta^{(K)} \right\}\).
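The mixture weights also determine how realizations are drawn from the model: first select a component index k with probability \(f^{(k)}\), then draw a sample from that component's density \(p^{(k)}(x,\theta^{(k)})\). The sketch below only illustrates this two-step procedure and is not code from the paper; the list of per-component samplers and all variable names are assumptions.

```python
# Illustrative two-step sampling from a finite mixture (not the authors' code).
# `component_samplers` is an assumed list of callables, one per mixture component,
# each returning one draw from p^(k)(x, theta^(k)).
import numpy as np

def sample_mixture(f, component_samplers, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_samples):
        k = rng.choice(len(f), p=f)               # pick component k with probability f^(k)
        draws.append(component_samplers[k](rng))  # draw from the selected component
    return np.array(draws)
```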

If each of the K mixture components is a Gaussian with mean \(\mu^{(k)}\) and covariance matrix \(C^{(k)}\), i.e.,

$$ p^{(k)}\left( x,\theta^{(k)} \right)=\left( 2\pi \right)^{-n/2}\text{Det}^{-1/2}\left( C^{(k)} \right)\exp\left\{ -\frac{1}{2}\left[ x-\mu^{(k)} \right]^{T}\left[ C^{(k)} \right]^{-1}\left[ x-\mu^{(k)} \right] \right\}, $$
(2)

then the finite mixture model defined in Eq. 1 becomes a Gaussian mixture model (GMM). In Eq. 2, \(\theta^{(k)}=\left\{ \mu^{(k)},C^{(k)} \right\}\).
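For illustration only (not the paper's implementation), the GMM density in Eqs. 1 and 2 can be evaluated directly from the component weights, means, and covariances; the helper below assumes these are stored as plain NumPy arrays.

```python
# Illustrative evaluation of the GMM density in Eqs. 1-2 (hypothetical helper).
import numpy as np

def gmm_pdf(x, f, mu, cov):
    """Return p_MM(x | Theta) = sum_k f^(k) * N(x; mu^(k), C^(k))."""
    n = len(x)
    density = 0.0
    for f_k, mu_k, C_k in zip(f, mu, cov):
        d = x - mu_k
        quad = d @ np.linalg.solve(C_k, d)                           # [x - mu]^T [C]^{-1} [x - mu]
        norm = (2.0 * np.pi) ** (-0.5 * n) * np.linalg.det(C_k) ** (-0.5)
        density += f_k * norm * np.exp(-0.5 * quad)                  # Eq. 2, weighted by f^(k)
    return density
```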

Consider a set of \(N_{T}\) samples (training data points) \(T=\left\{ x^{(1)},x^{(2)},\ldots,x^{(N_{T})} \right\}\), which are assumed to be independent and identically distributed (i.i.d.). For given parameters Θ, the conditional probability of component k given a sample \(x^{(i)}\) is called the "membership weight"; by Bayes' theorem, it can be computed as

$$ w^{(i,k)}=\frac{f^{(k)}p^{(k)}\left( x^{(i)},\theta^{(k)} \right)}{\sum\nolimits_{j=1}^{K} f^{(j)}p^{(j)}\left( x^{(i)},\theta^{(j)} \right)}, \quad \text{for } 1\le k\le K \text{ and } 1\le i\le N_{T}. $$
(3)

In this expression, \(f^{(k)}\) is the (prior) probability of component k, and the likelihood, i.e., the probability density of the sample \(x^{(i)}\) given this component, is \(p^{(k)}(x^{(i)},\theta^{(k)})\). The denominator in Eq. 3 is the probability density of sample \(x^{(i)}\); hence, the probability density of generating the \(N_{T}\) i.i.d. samples (for given parameters Θ) is

$$ p\left( x^{(1)},x^{(2)},\ldots,x^{(N_{T})} \mid {\Theta} \right)=\prod\nolimits_{i=1}^{N_{T}}\left[ \sum\nolimits_{j=1}^{K} f^{(j)}p^{(j)}\left( x^{(i)},\theta^{(j)} \right) \right]. $$
(4)

The maximum likelihood estimate of the parameters Θ can now be found using the iterative expectation-maximization (EM) algorithm.
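As a hedged illustration of Eqs. 3 and 4 (not the authors' implementation), the membership weights and the logarithm of the likelihood in Eq. 4 can be computed in a numerically stable way with the log-sum-exp trick, and an off-the-shelf EM implementation such as scikit-learn's GaussianMixture can then provide the maximum likelihood estimate of Θ; all names below are assumptions.

```python
# Illustrative sketch of Eqs. 3-4 and an EM fit; requires numpy, scipy, scikit-learn.
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp
from sklearn.mixture import GaussianMixture

def membership_weights(X, f, mu, cov):
    """Eq. 3 membership weights w^(i,k) for training points X (shape N_T x n)."""
    # log of f^(k) * p^(k)(x^(i), theta^(k)) for every pair (i, k)
    log_fp = np.column_stack([
        np.log(f_k) + multivariate_normal(mu_k, C_k).logpdf(X)
        for f_k, mu_k, C_k in zip(f, mu, cov)
    ])
    log_denom = logsumexp(log_fp, axis=1, keepdims=True)   # denominator of Eq. 3
    w = np.exp(log_fp - log_denom)                         # membership weights
    log_likelihood = log_denom.sum()                       # log of Eq. 4
    return w, log_likelihood

# EM-based maximum likelihood fit of Theta (E-step = Eq. 3, M-step = weighted
# updates of f, mu, C), delegated here to scikit-learn; n_components is arbitrary:
# gmm = GaussianMixture(n_components=3, covariance_type="full").fit(X_train)
```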

Appendix B: Analytical derivatives of GMM with respect to unknown GMM parameters

To reduce the computational cost of GMM fitting, we apply a gradient-based optimization algorithm to minimize the error function, which requires the analytical gradient of the error function and/or the sensitivity matrix of the normalized shift, both defined in the main text. By the chain rule, we have

$$ \nabla G\left( {\Theta} \right)=-2\sum\nolimits_{j=1}^{N_{T}} E\left( x^{(j)};{\Theta} \right)\nabla p_{\text{GMM}}^{(j)}\left( {\Theta} \right). $$
(5)

The GMM approximation at a training data point \(x^{(j)}\) is

$$ p_{\text{GMM}}^{(j)}\left( {\Theta} \right)=p_{\text{GMM}}\left( x^{(j)},{\Theta} \right)=\sum\nolimits_{l=1}^{K}\left[ \beta_{G}^{(l)}p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right) \right], $$
(6)
$$ p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right)=\exp\left\{ -\frac{1}{2}\left[ x^{(j)}-\mu^{(l)} \right]^{T}H^{(l)}\left[ x^{(j)}-\mu^{(l)} \right] \right\}. $$
(7)
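The fragment below is a minimal sketch (assumed variable names, not the paper's code) of Eqs. 6 and 7: each basis function \(p_{G}^{(l)}\) is an unnormalized Gaussian parameterized by the matrix \(H^{(l)}\), and the GMM approximation at a training point is the \(\beta\)-weighted sum of these basis functions.

```python
# Minimal sketch of Eqs. 6-7; beta, mu, and H are assumed inputs (lists of
# weights, mean vectors, and matrices, one entry per Gaussian component).
import numpy as np

def p_g(x_j, mu_l, H_l):
    """Eq. 7: unnormalized Gaussian basis function with matrix H^(l)."""
    d = x_j - mu_l
    return np.exp(-0.5 * d @ H_l @ d)

def p_gmm(x_j, beta, mu, H):
    """Eq. 6: p_GMM^(j) = sum_l beta_G^(l) * p_G^(l)(x^(j), H^(l))."""
    return sum(b_l * p_g(x_j, mu_l, H_l) for b_l, mu_l, H_l in zip(beta, mu, H))
```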

The derivatives of \(p_{\text{GMM}}^{(j)}\left({\Theta}\right)\) with respect to \(w^{(l)}\), \(\beta_{G}^{(l)}\), and \(v^{(l,k)}\) (for \(l=1,2,\ldots,K\) and \(k=1,2,\ldots,m\)) are

$$ \frac{\partial p_{\text{GMM}}^{(j)}\left( {\Theta} \right)}{\partial \beta_{G}^{(l)}}=p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right). $$
(8)
$$ \frac{\partial p_{\text{GMM}}^{(j)}\left( {\Theta} \right)}{\partial w^{(l)}}=-\frac{1}{2}\beta_{G}^{(l)}p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right)\left[ x^{(j)}-\mu^{(l)} \right]^{T}H_{G0}^{(l)}\left[ x^{(j)}-\mu^{(l)} \right]. $$
(9)
$$ \nabla_{v^{(l,k)}}p_{\text{GMM}}^{(j)}\left( {\Theta} \right)=-\beta_{G}^{(l)}p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right)\left[ x^{(j)}-\mu^{(l)} \right]^{T}v^{(l,k)}\left[ x^{(j)}-\mu^{(l)} \right]. $$
(10)
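As a hedged sketch only, the partial derivatives in Eqs. 8 to 10 can be coded directly from the expressions above. The base matrices \(H_{G0}^{(l)}\) and the vectors \(v^{(l,k)}\) come from the parameterization of \(H^{(l)}\) given in the main text (not reproduced in this appendix), so they are treated here as given inputs, and the per-point errors \(E(x^{(j)};\Theta)\) needed to assemble \(\nabla G\) in Eq. 5 are likewise assumed to be available.

```python
# Illustrative derivatives per Eqs. 8-10 and gradient accumulation per Eq. 5.
# H_G0_l, v_lk, and the errors E_j follow the main-text parameterization and are
# treated as given inputs (assumption); all names are hypothetical.
import numpy as np

def p_g(x_j, mu_l, H_l):
    """Eq. 7 basis function, repeated so this sketch is self-contained."""
    d = x_j - mu_l
    return np.exp(-0.5 * d @ H_l @ d)

def d_p_d_beta(x_j, mu_l, H_l):
    """Eq. 8: derivative of p_GMM^(j) with respect to beta_G^(l)."""
    return p_g(x_j, mu_l, H_l)

def d_p_d_w(x_j, beta_l, mu_l, H_l, H_G0_l):
    """Eq. 9: derivative of p_GMM^(j) with respect to w^(l)."""
    d = x_j - mu_l
    return -0.5 * beta_l * p_g(x_j, mu_l, H_l) * (d @ H_G0_l @ d)

def d_p_d_v(x_j, beta_l, mu_l, H_l, v_lk):
    """Eq. 10: gradient of p_GMM^(j) with respect to the vector v^(l,k)."""
    d = x_j - mu_l
    return -beta_l * p_g(x_j, mu_l, H_l) * (d @ v_lk) * d

# Eq. 5 accumulates these over all training points, e.g. for the beta-derivatives:
# grad_beta_l = -2.0 * sum(E_j * d_p_d_beta(x_j, mu_l, H_l) for x_j, E_j in zip(X, E))
```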


About this article

Cite this article

Gao, G., Jiang, H., Vink, J.C. et al. Gaussian mixture model fitting method for uncertainty quantification by conditioning to production data. Comput Geosci 24, 663–681 (2020). https://doi.org/10.1007/s10596-019-9823-3
