
Gaussian mixture model fitting method for uncertainty quantification by conditioning to production data

  • Original Paper
  • Published in: Computational Geosciences

Abstract

For most history matching problems, the posterior probability density function (PDF) may have multiple local maxima, and it is extremely challenging to quantify the uncertainty of model parameters and production forecasts by conditioning to production data. In this paper, a novel method is proposed to improve the accuracy of the Gaussian mixture model (GMM) approximation of the complex posterior PDF by adding more Gaussian components. Simulation results of all reservoir models generated during the history matching process, e.g., using the distributed Gauss-Newton (DGN) optimizer, are used as training data points for GMM fitting. The distance between the GMM approximation and the actual posterior PDF is estimated by summing the errors calculated at all training data points. The distance is an analytical function of the unknown GMM parameters, such as the covariance matrix and weighting factor of each Gaussian component, and these parameters are determined by minimizing the distance function. A GMM is accepted if the distance is reasonably small; otherwise, new Gaussian components are added iteratively to further reduce the distance until convergence. Finally, high-quality conditional realizations are generated by sampling from each Gaussian component in the mixture with the appropriate relative probability. The proposed method is first validated using nonlinear toy problems and then applied to a history matching example. GMM generates better samples at a computational cost comparable to or lower than that of the other methods we tested. The GMM samples yield production forecasts that match production data reasonably well in the history matching period and are consistent with production data observed in the blind test period.


References

  1. Aanonsen, S.I., et al.: The ensemble Kalman filter in reservoir engineering—a review. SPE J. 14(3), 393–412 (2009)

  2. Alabert, F.: The practice of fast conditional simulations through the LU decomposition of the covariance matrix. Math. Geol. 19(5), 369–386 (1987)

  3. Araujo, M., et al.: Benchmarking of advanced methods for assisted history matching and uncertainty quantification. SPE-193910-MS to be presented at the SPE Reservoir Simulation Conference held in Galveston, Texas, USA, 10-11 April (2019)

  4. Bilmes, J.A.: Gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models, Technical Report ICSI-TR-97-02, University of Berkeley (1997)

  5. Chen, C., et al.: Global search distributed-Gauss-Newton optimization method and its integration with the randomized-maximum-likelihood method for uncertainty quantification of reservoir performance. SPE J. 23(5), 1496–1517 (2018). https://doi.org/10.2118/182639-PA

  6. Chen, Y., Oliver, D.: Ensemble randomized maximum likelihood method as an iterative ensemble smoother. Math. Geosci. 44(1), 1–26 (2012)

  7. Chen, Y., Oliver, D.: Levenberg-Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput. Geosci. 17(4), 689–703 (2013)

  8. Figueiredo, M.: On Gaussian radial basis function approximations: interpretation, extensions, and learning strategies, Proceedings 15th International Conference on Pattern Recognition held in Barcelona, Spain, 3-7 September (2000)

  9. Figueiredo, M., Leitao, J., Jain, A.K.: On fitting mixture models. In: Hancock, E., Pellilo, M. (eds.) Energy Minimization Methods in Computer Vision and Pattern Recognition, pp. 54–69. Springer (1999)

  10. Chu, L., Reynolds, A.C., Oliver, D.: Computation of sensitivity coefficients for conditioning the permeability field to well-test data. In Situ 19(2), 179–223 (1995)

  11. Davis, M.: Production of conditional simulations via the LU decomposition of the covariance matrix. Math. Geol. 19(2), 91–98 (1987)

  12. Ehrendorfer, M.: A review of issues in ensemble-based Kalman filtering. Meteorol. Z. 16(6), 795–818 (2007)

  13. Elsheikh, A.H., Wheeler, M.F., Hoteit, I.: Clustered iterative stochastic ensemble method for multi-modal calibration of subsurface flow models. J. Hydrol. 491, 40–55 (2013)

  14. Emerick, A.A., Reynolds, A.: Ensemble smoother with multiple data assimilation. Comput. Geosci. 55, 3–15 (2013)

  15. Evensen, G.: Data assimilation: the ensemble Kalman filter. Springer, New York (2007)

  16. Gao, G., et al.: Robust uncertainty quantification through integration of distributed Gauss-Newton optimization with Gaussian Mixture Model and Parallelized Sampling Algorithms. Paper SPE-191516-MS presented at the SPE Annual Technical Conference and Exhibition, Dallas, Texas, USA, 24-26 September (2018)

  17. Gao, G., et al.: Uncertainty quantification for history matching problems with multiple best matches using a distributed Gauss-Newton method. Paper SPE-181611-MS presented at the SPE Annual Technical Conference and Exhibition, Dubai, UAE, 26–28 September (2016)

  18. Gao, G., et al.: Distributed Gauss-Newton optimization method for history matching problems with multiple best matches. Comput. Geosci. 21(5-6), 1325–1342 (2017)

  19. Gao, G., et al.: A Gauss-Newton trust region solver for large scale history matching problems. SPE J. 22 (6), 1999–2011 (2017)

  20. Guo, Z., et al.: Integration of support vector regression with distributed Gauss-Newton optimization method and its applications to the uncertainty assessment of unconventional assets. SPE Reservoir Evaluation & Engineering, 21(4) (2018)

  21. Guo, Z., et al.: Enhancing the performance of the distributed Gauss-Newton optimization method by reducing the effect of numerical noise and truncation error with support-vector regression. SPE J. 23(6), 2428–2443 (2018)

  22. Grana, D., Fjeldstad, T., Omer, H.: Bayesian Gaussian mixture linear inversion in geophysical inverse problems. Math. Geosci. 49(4), 493–515 (2017). https://doi.org/10.1007/s11004-016-9671-9

  23. Kitanidis, P.: Quasi-linear geostatistical theory for inversing. Water Resour. Res. 31(10), 2411–2419 (1995)

  24. Liu, N., Oliver, D.: Evaluation of Monte Carlo methods for assessing uncertainty. SPE J. 8(2), 188–195 (2003)

  25. McLachlan, G.: Finite mixture models. Wiley, New York (2000)

  26. Meyn, S.P., Tweedie, R.L.: Markov chains and stochastic stability. Springer, London (1993)

  27. Muthen, B., Shedden, K.: Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics 55, 463–469 (1999)

  28. Oliver, D.S.: Multiple realization of the permeability field from well-test data. SPE J. 1(2), 145–155 (1996)

  29. Oliver, D.: Metropolized randomized maximum likelihood for improved sampling from multimodal distributions. SIAM/ASA J. Uncertainty Quantification 5(1), 259–277 (2017)

  30. Oliver, D.S., Chen, Y.: Recent progress on reservoir history matching: a review. Comput. Geosci. 15(1), 185–211 (2011)

  31. Oliver, D.S., Reynolds, A.C., Liu, N.: Inverse theory for petroleum reservoir characterization and history matching. Cambridge University Press, Cambridge (2008)

  32. Oliver, D.S., Alfonzo, M.: Calibration of imperfect models to biased observations. Comput. Geosci. 22(1), 145–161 (2018)

  33. Rafiee, J., Reynolds, A.C.: A two-level MCMC based on the distributed Gauss-Newton method for uncertainty quantification. The 16th European Conference on the Mathematics of Oil Recovery, Barcelona, Spain, 3-6 September (2018)

  34. Sondergaard, T., Lermusiaux, P.F.: Data assimilation with Gaussian mixture models using the dynamically orthogonal field equations. Part I: Theory and scheme. Monthly Weather Review. 141(6), 1737–1760 (2013)

  35. Sondergaard, T., Lermusiaux, P.F.: Data assimilation with Gaussian mixture models using the dynamically orthogonal field equations. Part II: applications. Monthly Weather Review. 141(6), 1737–1760 (2013)

  36. Stordal, A.: Iterative Bayesian inversion with Gaussian mixtures: finite sample implementation and large sample asymptotics. Comput. Geosci. 19(1), 1–15 (2015)

  37. Sun, W., Vink, J.C., Gao, G.: A practical method to mitigate spurious uncertainty reduction in history matching workflows with imperfect reservoir model. Paper SPE-182599-MS presented at the SPE Reservoir Simulation Conference held in Montgomery, TX, USA, 20–22 February (2017)

  38. Sung, H.: Gaussian mixture regression and classification. Ph.D. thesis, Rice University, Houston, Texas, USA (2004)

  39. Tarantola, A.: Inverse problem theory and methods for model parameter estimation. SIAM (2005)

  40. Yu, G., Sapiro, G., Mallat, S.: Solving inverse problems with piecewise linear estimators: from Gaussian mixture models to structured sparsity. IEEE Trans. Image Process. 21(5), 2481–2499 (2012)


Author information

Corresponding author

Correspondence to Guohua Gao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Hao Jiang now works for Memorial Hermann Health System.

Appendices

Appendix A: Finite mixture models

A finite mixture model [25] with K components, \(p_{\mathrm{MM}}(x\mid{\Theta})\), is defined as

$$ p_{\mathrm{MM}}\left( x \mid {\Theta} \right)=\sum\nolimits_{k=1}^{K} f^{(k)}p^{(k)}\left( x,\theta^{(k)}\right). $$
(1)

In Eq. 1, \(x\) is an \(n\)-dimensional random vector, \(p^{(k)}(x,\theta^{(k)})\) is the probability density function (PDF) with parameters \(\theta^{(k)}\) for the \(k\)th mixture component with \(1\le k\le K\), and \(f^{(k)}\) is the positive-valued mixture weight of the \(k\)th mixture component, representing the probability of generating a sample from this component. By definition, \(\sum_{k=1}^{K} f^{(k)}=1\). The complete set of parameters defining a mixture model with K components is \({\Theta}=\left\{ f^{(1)},f^{(2)},\ldots,f^{(K)};\theta^{(1)},\theta^{(2)},\ldots,\theta^{(K)} \right\}\).
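The mixture weights also determine how realizations are drawn from the model: first select a component index k with probability \(f^{(k)}\), then draw a sample from that component's density \(p^{(k)}(x,\theta^{(k)})\). The sketch below only illustrates this two-step procedure and is not code from the paper; the list of per-component samplers and all variable names are assumptions.

```python
# Illustrative two-step sampling from a finite mixture (not the authors' code).
# `component_samplers` is an assumed list of callables, one per mixture component,
# each returning one draw from p^(k)(x, theta^(k)).
import numpy as np

def sample_mixture(f, component_samplers, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_samples):
        k = rng.choice(len(f), p=f)               # pick component k with probability f^(k)
        draws.append(component_samplers[k](rng))  # draw from the selected component
    return np.array(draws)
```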

If each of the K mixture components is a Gaussian with mean \(\mu^{(k)}\) and covariance matrix \(C^{(k)}\), i.e.,

$$ p^{(k)}\left( x,\theta^{(k)} \right)=\left( 2\pi \right)^{-n/2}\text{Det}^{-1/2}\left( C^{(k)} \right)\exp\left\{ -\frac{1}{2}\left[ x-\mu^{(k)} \right]^{T}\left[ C^{(k)} \right]^{-1}\left[ x-\mu^{(k)} \right] \right\}, $$
(2)

then the finite mixture model defined in Eq. 1 becomes a Gaussian mixture model (GMM). In Eq. 2, \(\theta^{(k)}=\left\{ \mu^{(k)},C^{(k)} \right\}\).
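For illustration only (not the paper's implementation), the GMM density in Eqs. 1 and 2 can be evaluated directly from the component weights, means, and covariances; the helper below assumes these are stored as plain NumPy arrays.

```python
# Illustrative evaluation of the GMM density in Eqs. 1-2 (hypothetical helper).
import numpy as np

def gmm_pdf(x, f, mu, cov):
    """Return p_MM(x | Theta) = sum_k f^(k) * N(x; mu^(k), C^(k))."""
    n = len(x)
    density = 0.0
    for f_k, mu_k, C_k in zip(f, mu, cov):
        d = x - mu_k
        quad = d @ np.linalg.solve(C_k, d)                           # [x - mu]^T [C]^{-1} [x - mu]
        norm = (2.0 * np.pi) ** (-0.5 * n) * np.linalg.det(C_k) ** (-0.5)
        density += f_k * norm * np.exp(-0.5 * quad)                  # Eq. 2, weighted by f^(k)
    return density
```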

Consider a set of \(N_{T}\) samples (training data points) \(T=\left\{ x^{(1)},x^{(2)},\ldots,x^{(N_{T})} \right\}\), which are assumed to be independent and identically distributed (i.i.d.). For given parameters Θ, the conditional probability of component k given a sample \(x^{(i)}\) is called the "membership weight"; by Bayes' theorem, it can be computed as

$$ w^{(i,k)}=\frac{f^{(k)}p^{(k)}\left( x^{(i)},\theta^{(k)} \right)}{\sum\nolimits_{j=1}^{K} f^{(j)}p^{(j)}\left( x^{(i)},\theta^{(j)} \right)}, \quad \text{for } 1\le k\le K \text{ and } 1\le i\le N_{T}. $$
(3)

In this expression, \(f^{(k)}\) is the (prior) probability of component k, and the likelihood, i.e., the probability density of the sample \(x^{(i)}\) given this component, is \(p^{(k)}(x^{(i)},\theta^{(k)})\). The denominator in Eq. 3 is the probability density of sample \(x^{(i)}\); hence, the probability density of generating the \(N_{T}\) i.i.d. samples (for given parameters Θ) is

$$ p\left( x^{(1)},x^{(2)},\ldots,x^{(N_{T})} \mid {\Theta} \right)=\prod\nolimits_{i=1}^{N_{T}}\left[ \sum\nolimits_{j=1}^{K} f^{(j)}p^{(j)}\left( x^{(i)},\theta^{(j)} \right) \right]. $$
(4)

The maximum likelihood estimate of the parameters Θ can now be found using the iterative expectation-maximization (EM) algorithm.
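As a hedged illustration of Eqs. 3 and 4 (not the authors' implementation), the membership weights and the logarithm of the likelihood in Eq. 4 can be computed in a numerically stable way with the log-sum-exp trick, and an off-the-shelf EM implementation such as scikit-learn's GaussianMixture can then provide the maximum likelihood estimate of Θ; all names below are assumptions.

```python
# Illustrative sketch of Eqs. 3-4 and an EM fit; requires numpy, scipy, scikit-learn.
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp
from sklearn.mixture import GaussianMixture

def membership_weights(X, f, mu, cov):
    """Eq. 3 membership weights w^(i,k) for training points X (shape N_T x n)."""
    # log of f^(k) * p^(k)(x^(i), theta^(k)) for every pair (i, k)
    log_fp = np.column_stack([
        np.log(f_k) + multivariate_normal(mu_k, C_k).logpdf(X)
        for f_k, mu_k, C_k in zip(f, mu, cov)
    ])
    log_denom = logsumexp(log_fp, axis=1, keepdims=True)   # denominator of Eq. 3
    w = np.exp(log_fp - log_denom)                         # membership weights
    log_likelihood = log_denom.sum()                       # log of Eq. 4
    return w, log_likelihood

# EM-based maximum likelihood fit of Theta (E-step = Eq. 3, M-step = weighted
# updates of f, mu, C), delegated here to scikit-learn; n_components is arbitrary:
# gmm = GaussianMixture(n_components=3, covariance_type="full").fit(X_train)
```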

Appendix B: Analytical derivatives of GMM with respect to unknown GMM parameters

To reduce the computational cost of GMM fitting, we apply a gradient-based optimization algorithm to minimize the error function, which requires the analytical gradient of the error function and/or the sensitivity matrix of the normalized shift, both defined in the main text. By the chain rule, we have

$$ \nabla G\left( {\Theta} \right)=-2\sum\nolimits_{j=1}^{N_{T}} E\left( x^{(j)};{\Theta} \right)\nabla p_{\text{GMM}}^{(j)}\left( {\Theta} \right). $$
(5)

The GMM approximation at a training data point \(x^{(j)}\) is

$$ p_{\text{GMM}}^{(j)}\left( {\Theta} \right)=p_{\text{GMM}}\left( x^{(j)},{\Theta} \right)=\sum\nolimits_{l=1}^{K}\left[ \beta_{G}^{(l)}p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right) \right], $$
(6)
$$ p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right)=\exp\left\{ -\frac{1}{2}\left[ x^{(j)}-\mu^{(l)} \right]^{T}H^{(l)}\left[ x^{(j)}-\mu^{(l)} \right] \right\}. $$
(7)
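The fragment below is a minimal sketch (assumed variable names, not the paper's code) of Eqs. 6 and 7: each basis function \(p_{G}^{(l)}\) is an unnormalized Gaussian parameterized by the matrix \(H^{(l)}\), and the GMM approximation at a training point is the \(\beta\)-weighted sum of these basis functions.

```python
# Minimal sketch of Eqs. 6-7; beta, mu, and H are assumed inputs (lists of
# weights, mean vectors, and matrices, one entry per Gaussian component).
import numpy as np

def p_g(x_j, mu_l, H_l):
    """Eq. 7: unnormalized Gaussian basis function with matrix H^(l)."""
    d = x_j - mu_l
    return np.exp(-0.5 * d @ H_l @ d)

def p_gmm(x_j, beta, mu, H):
    """Eq. 6: p_GMM^(j) = sum_l beta_G^(l) * p_G^(l)(x^(j), H^(l))."""
    return sum(b_l * p_g(x_j, mu_l, H_l) for b_l, mu_l, H_l in zip(beta, mu, H))
```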

The derivatives of \(p_{\text{GMM}}^{(j)}\left({\Theta}\right)\) with respect to \(w^{(l)}\), \(\beta_{G}^{(l)}\), and \(v^{(l,k)}\) (for \(l=1,2,\ldots,K\) and \(k=1,2,\ldots,m\)) are

$$ \frac{\partial p_{\text{GMM}}^{(j)}\left( {\Theta} \right)}{\partial \beta_{G}^{(l)}}=p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right). $$
(8)
$$ \frac{\partial p_{\text{GMM}}^{(j)}\left( {\Theta} \right)}{\partial w^{(l)}}=-\frac{1}{2}\beta_{G}^{(l)}p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right)\left[ x^{(j)}-\mu^{(l)} \right]^{T}H_{G0}^{(l)}\left[ x^{(j)}-\mu^{(l)} \right]. $$
(9)
$$ \nabla_{v^{(l,k)}}p_{\text{GMM}}^{(j)}\left( {\Theta} \right)=-\beta_{G}^{(l)}p_{G}^{(l)}\left( x^{(j)},H^{(l)} \right)\left[ x^{(j)}-\mu^{(l)} \right]^{T}v^{(l,k)}\left[ x^{(j)}-\mu^{(l)} \right]. $$
(10)
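As a hedged sketch only, the partial derivatives in Eqs. 8 to 10 can be coded directly from the expressions above. The base matrices \(H_{G0}^{(l)}\) and the vectors \(v^{(l,k)}\) come from the parameterization of \(H^{(l)}\) given in the main text (not reproduced in this appendix), so they are treated here as given inputs, and the per-point errors \(E(x^{(j)};\Theta)\) needed to assemble \(\nabla G\) in Eq. 5 are likewise assumed to be available.

```python
# Illustrative derivatives per Eqs. 8-10 and gradient accumulation per Eq. 5.
# H_G0_l, v_lk, and the errors E_j follow the main-text parameterization and are
# treated as given inputs (assumption); all names are hypothetical.
import numpy as np

def p_g(x_j, mu_l, H_l):
    """Eq. 7 basis function, repeated so this sketch is self-contained."""
    d = x_j - mu_l
    return np.exp(-0.5 * d @ H_l @ d)

def d_p_d_beta(x_j, mu_l, H_l):
    """Eq. 8: derivative of p_GMM^(j) with respect to beta_G^(l)."""
    return p_g(x_j, mu_l, H_l)

def d_p_d_w(x_j, beta_l, mu_l, H_l, H_G0_l):
    """Eq. 9: derivative of p_GMM^(j) with respect to w^(l)."""
    d = x_j - mu_l
    return -0.5 * beta_l * p_g(x_j, mu_l, H_l) * (d @ H_G0_l @ d)

def d_p_d_v(x_j, beta_l, mu_l, H_l, v_lk):
    """Eq. 10: gradient of p_GMM^(j) with respect to the vector v^(l,k)."""
    d = x_j - mu_l
    return -beta_l * p_g(x_j, mu_l, H_l) * (d @ v_lk) * d

# Eq. 5 accumulates these over all training points, e.g. for the beta-derivatives:
# grad_beta_l = -2.0 * sum(E_j * d_p_d_beta(x_j, mu_l, H_l) for x_j, E_j in zip(X, E))
```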


About this article

Cite this article

Gao, G., Jiang, H., Vink, J.C. et al. Gaussian mixture model fitting method for uncertainty quantification by conditioning to production data. Comput Geosci 24, 663–681 (2020). https://doi.org/10.1007/s10596-019-9823-3
