Goodness of fit for models with intractable likelihood

Cabras, Stefano; Castellanos, María Eugenia; Ratmann, Oliver

doi:10.1007/s11749-020-00747-7

Original Paper
Published: 02 January 2021

TEST, Volume 30, pages 713–736 (2021)

Stefano Cabras¹ (ORCID: orcid.org/0000-0001-6690-8378),
María Eugenia Castellanos²,³ &
Oliver Ratmann⁴

Abstract

Routine goodness-of-fit analyses of complex models with intractable likelihoods are hampered by a lack of computationally tractable diagnostic measures with well-understood frequency properties, that is, with a known sampling distribution. This frustrates the ability to assess the extremity of the data relative to fitted simulation models in terms of pre-specified test statistics, an essential requirement for model improvement. Given an Approximate Bayesian Computation setting for a posited model whose likelihood is intractable but from which it is possible to simulate, we present a general and computationally inexpensive Monte Carlo framework for obtaining \(p\)-values that are asymptotically uniformly distributed in [0, 1] under the posited model when assumptions about the asymptotic equivalence between the conditional statistic and the maximum likelihood estimator hold. The proposed framework follows almost directly from the conditional predictive \(p\)-value proposed in the Bayesian literature. Numerical investigations demonstrate favorable power properties in detecting actual model discrepancies relative to other diagnostic approaches. We illustrate the technique on analytically tractable examples and on a complex tuberculosis transmission model.
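To make the idea in the abstract concrete, the following is a minimal sketch of a simulation-based conditional predictive \(p\)-value on a toy model. It is not the authors' algorithm, only an illustration under stated assumptions: the function name `abc_conditional_pvalue`, the tuning parameter `keep_frac`, the Normal toy model with known unit standard deviation, the choice of the sample mean as conditioning statistic and the sample variance as test statistic are all hypothetical. Parameters are drawn from the prior, data are simulated, and only the simulations whose conditioning statistic lies closest to the observed one are retained, as in ABC rejection sampling. The retained test statistics then approximate draws from the predictive distribution conditional on the observed conditioning statistic.

```python
import numpy as np


def abc_conditional_pvalue(y_obs, simulate, prior_draw, cond_stat, test_stat,
                           n_sims=20000, keep_frac=0.02, rng=None):
    """Monte Carlo estimate of Pr(T(y_rep) >= T(y_obs) | U(y_rep) ~ U(y_obs)).

    Illustrative ABC-rejection sketch; all argument names are assumptions.
    """
    rng = np.random.default_rng(rng)
    u_obs = cond_stat(y_obs)
    t_obs = test_stat(y_obs)

    u_sim = np.empty(n_sims)
    t_sim = np.empty(n_sims)
    for i in range(n_sims):
        theta = prior_draw(rng)                  # draw a parameter from the prior
        y_rep = simulate(theta, len(y_obs), rng)  # simulate a replicate data set
        u_sim[i] = cond_stat(y_rep)
        t_sim[i] = test_stat(y_rep)

    # ABC rejection step: keep the simulations closest to u_obs.
    n_keep = max(1, int(keep_frac * n_sims))
    keep = np.argsort(np.abs(u_sim - u_obs))[:n_keep]

    # Tail proportion of the retained test statistics.
    return float(np.mean(t_sim[keep] >= t_obs))


# Toy posited model (assumption): y_i ~ Normal(theta, 1), theta ~ Normal(0, 10).
def prior_draw(rng):
    return rng.normal(0.0, 10.0)


def simulate(theta, n, rng):
    return rng.normal(theta, 1.0, size=n)


def sample_variance(y):
    return np.var(y, ddof=1)


data_rng = np.random.default_rng(0)
y_ok = data_rng.normal(2.0, 1.0, size=50)   # generated under the posited model
y_bad = data_rng.normal(2.0, 3.0, size=50)  # overdispersed: model is wrong

p_ok = abc_conditional_pvalue(y_ok, simulate, prior_draw,
                              np.mean, sample_variance, rng=1)
p_bad = abc_conditional_pvalue(y_bad, simulate, prior_draw,
                               np.mean, sample_variance, rng=1)
print(p_ok, p_bad)
```

In this toy setting the sample mean is the maximum likelihood estimator of the location parameter, which is the kind of conditioning statistic the abstract's asymptotic assumption refers to. The overdispersed data set should yield a \(p\)-value near zero, while the correctly specified one should not be systematically extreme. The tolerance implied by `keep_frac` trades Monte Carlo error against approximation error and would need tuning in any real application.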

Author information

Authors and Affiliations

Department of Statistics, Universidad Carlos III de Madrid, Madrid, Spain
Stefano Cabras
Department of Informatics and Statistics, Universidad Rey Juan Carlos, Madrid, Spain
María Eugenia Castellanos
Department of Economics, Università degli Studi di Cagliari, Cagliari, Italy
María Eugenia Castellanos
Imperial College London, London, UK
Oliver Ratmann

Corresponding author

Correspondence to Stefano Cabras.

Additional information

The authors have been funded by MINECO-Spain projects PID2019-104790GB-I00 (M.E. Castellanos and S. Cabras) and Wellcome Trust fellowship WR092311MF (O. Ratmann).

About this article

Cite this article

Cabras, S., Castellanos, M.E. & Ratmann, O. Goodness of fit for models with intractable likelihood. TEST 30, 713–736 (2021). https://doi.org/10.1007/s11749-020-00747-7

Received: 10 December 2018
Accepted: 27 November 2020
Published: 02 January 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11749-020-00747-7

Mathematics Subject Classification

62F15
