Abstract
Finite mixtures of (multivariate) Gaussian distributions have broad utility, including their use in model-based clustering. Mixtures of asymmetric distributions are increasingly recognized as powerful alternatives to traditional mixtures of Gaussian distributions and mixtures of t distributions. The present work supports that view by addressing several facets of estimation and inference for mixtures-of-gamma distributions, including in the context of model-based clustering. Maximum likelihood estimation of mixtures of gammas is performed using an expectation–conditional–maximization (ECM) algorithm. The Wilson–Hilferty normal approximation is employed as part of an effective starting value strategy for the ECM algorithm and also provides insight into an effective model-based clustering strategy. Inference regarding the appropriateness of a common-shape mixture-of-gammas distribution is motivated by theory from research on infant habituation. We provide extensive simulation results that demonstrate the strong performance of our routines, and we analyze two real data examples: an infant habituation dataset and a whole genome duplication dataset.
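To illustrate the Wilson–Hilferty approximation named in the abstract, the following is a minimal Python sketch, assuming a shape–scale parameterization of the gamma distribution. Under that approximation, the cube root of a gamma variate is roughly normal, with mean (shape × scale)^(1/3) × (1 − 1/(9 × shape)) and variance (shape × scale)^(2/3) / (9 × shape). The function names are hypothetical, and the inversion step is only one plausible way to turn a normal fit on the cube-root scale into gamma starting values; it is not claimed to be the procedure used in the paper.

```python
import math
import random
import statistics


def wilson_hilferty_moments(shape, scale):
    """Approximate mean and variance of X**(1/3) when X ~ Gamma(shape, scale)."""
    center = (shape * scale) ** (1.0 / 3.0)
    mean = center * (1.0 - 1.0 / (9.0 * shape))
    var = center ** 2 / (9.0 * shape)
    return mean, var


def gamma_from_cube_root_moments(mean, var):
    """Invert the approximation: map normal moments of X**(1/3) to (shape, scale).

    With u = 9 * shape, mean**2 / var = (u - 1)**2 / u, a quadratic in u.
    The larger root is taken, which assumes shape > 1/9.
    """
    ratio = mean ** 2 / var
    u = ((2.0 + ratio) + math.sqrt((2.0 + ratio) ** 2 - 4.0)) / 2.0
    shape = u / 9.0
    center = mean / (1.0 - 1.0 / u)
    scale = center ** 3 / shape
    return shape, scale


# Small demonstration on simulated data (illustrative values only).
rng = random.Random(2019)
cube_roots = [rng.gammavariate(4.0, 2.0) ** (1.0 / 3.0) for _ in range(5000)]
sample_mean = statistics.fmean(cube_roots)
sample_var = statistics.variance(cube_roots)
print("approximate moments:", wilson_hilferty_moments(4.0, 2.0))
print("sample moments:     ", (sample_mean, sample_var))
print("implied (shape, scale):", gamma_from_cube_root_moments(sample_mean, sample_var))
```

Working on the cube-root scale is convenient because normal-mixture tools can be applied there and the resulting component moments mapped back to gamma parameters.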
Notes
We use the same random starting value strategy for the mixture-of-normals EM as discussed above for the mixture-of-gammas ECM, but now a normal distribution is fit to each of the k partitions. We perform five such random initializations and retain the one with the best fit according to the final observed log-likelihood value.
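The multiple-restart idea in this note can be sketched as follows for a univariate mixture of normals. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the partitioning rule (draw k observations as centers and assign every point to its nearest center), the standard-deviation floor of 1e-6, and all function names are hypothetical choices made for the example. Only the overall scheme (fit a normal to each of the k partitions, run EM, repeat five times, keep the fit with the highest final observed log-likelihood) comes from the note.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(data, weights, mus, sigmas):
    """Observed-data log-likelihood of a univariate normal mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)))
        for x in data
    )


def random_partition_start(data, k, rng):
    """Assumed rule: nearest-random-center partition, then a normal fit per group.

    Redraws until every group has at least two points (needs len(data) >= 2 * k).
    """
    while True:
        centers = rng.sample(data, k)
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        if all(len(g) >= 2 for g in groups):
            break
    weights = [len(g) / len(data) for g in groups]
    mus = [statistics.fmean(g) for g in groups]
    sigmas = [max(statistics.stdev(g), 1e-6) for g in groups]  # floor is an assumption
    return weights, mus, sigmas


def em_normal_mixture(data, weights, mus, sigmas, max_iter=200, tol=1e-8):
    """Plain EM for a k-component univariate normal mixture."""
    k = len(weights)
    loglik = log_likelihood(data, weights, mus, sigmas)
    for _ in range(max_iter):
        # E-step: posterior membership probabilities for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weighted proportions, means, and standard deviations.
        new_w, new_mu, new_sd = [], [], []
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mu = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mu) ** 2 for r, x in zip(resp, data)) / nj
            new_w.append(nj / len(data))
            new_mu.append(mu)
            new_sd.append(max(math.sqrt(var), 1e-6))
        weights, mus, sigmas = new_w, new_mu, new_sd
        new_loglik = log_likelihood(data, weights, mus, sigmas)
        converged = new_loglik - loglik < tol
        loglik = new_loglik
        if converged:
            break
    return loglik, weights, mus, sigmas


def best_of_restarts(data, k, n_starts=5, seed=0):
    """Run EM from n_starts random starts; return the best fit and all fits."""
    rng = random.Random(seed)
    fits = [
        em_normal_mixture(data, *random_partition_start(data, k, rng))
        for _ in range(n_starts)
    ]
    return max(fits, key=lambda fit: fit[0]), fits
```

A call such as `best, fits = best_of_restarts(data, k=2)` returns the retained fit as a tuple `(loglik, weights, mus, sigmas)` together with all candidate fits, so the spread of final log-likelihood values across restarts can be inspected.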
The data are available at http://datadryad.org/resource/doi:10.5061/dryad.st3gt (Yang et al. 2018) and the script is available at https://github.com/tanghaibao/bio-pipeline/tree/master/synonymous_calculation.
References
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F (eds) Second international symposium on information theory. Akademiai, Budapest, pp 267–281
Al-Saleh JA, Agarwal SK (2007) Finite mixture of gamma distributions: a conjugate prior. Comput Stat Data Anal 51(9):4369–4378
Almhana J, Liu Z, Choulakian V, McGorman R (2006) A recursive algorithm for gamma mixture models. In: 2006 IEEE international conference on communications, vol 1, pp 197–202
Atapattu S, Tellambura C, Jiang H (2011) A mixture gamma distribution to model the SNR of wireless channels. IEEE Trans Wirel Commun 10(12):4193–4203
Baudry J-P, Celeux G (2015) EM for mixtures: initialization requires special care. Stat Comput 25(4):713–726
Benaglia T, Chauveau D, Hunter DR, Young DS (2009) mixtools: an R package for analyzing finite mixture models. J Stat Softw 32(6):1–29
Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3–4):561–575
Bochkina N, Rousseau J (2017) Adaptive density estimation based on a mixture of gammas. Electron J Stat 11:916–962
Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc Ser B 26(2):211–252
Chen J, Khalili A (2009) Order selection in finite mixture models with a nonsmooth penalty. J Am Stat Assoc 104(485):187–196
Chen H, Chen J, Kalbfleisch JD (2001) A modified likelihood ratio test for homogeneity in finite mixture models. J R Stat Soc Ser B 63(1):19–29
Clark JW, Donoghue PCJ (2017) Constraining the timing of whole genome duplication in plant evolutionary history. Proc R Soc B Biol Sci 284(20170912):1–8
Clark JW, Donoghue PCJ (2018) Whole-genome duplication and plant macroevolution. Trends Plant Sci 23(10):933–945
Colombo J, Mitchell DW (2009) Infant visual habituation. Neurobiol Learn Mem 92(2):225–234
Colombo J, Kapa L, Curtindale L (2011) Varieties of attention in infancy. In: Oakes LM, Cashon CH, Casasola M, Rakison DH (eds) Infant perception and cognition: recent advances, emerging theories, and future directions. Oxford University Press, New York, pp 3–26
Cutler A, Cordero-Braña OI (1996) Minimum Hellinger distance estimation for finite mixture models. J Am Stat Assoc 91(436):1716–1723
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39(1):1–38
Dvorkin D (2012) lcmix: layered and chained mixture models. R package version 0.3/r5
Evin G, Merleau J, Perreault L (2011) Two-component mixtures of normal, gamma, and Gumbel distributions for hydrological applications. Water Resour Res 47(8):1–21
Feng ZD, McCulloch CE (1996) Using bootstrap likelihood ratios in finite mixture models. J R Stat Soc Ser B 58(3):609–617
Figueiredo MAT, Jain AK (2002) Unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell 24(3):381–396
Fraley C, Raftery AE (2002) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97(458):611–631
Fraley C, Raftery AE (2007) Bayesian regularization for normal mixture estimation and model-based clustering. J Classif 24(2):155–181
Frühwirth-Schnatter S (2006) Finite mixture and Markov switching models. Springer, New York
García-Escudero LA, Gordaliza A, Matrán C, Mayo-Iscar A (2008) A general trimming approach to robust cluster analysis. Ann Stat 36(3):1324–1345
García-Escudero LA, Gordaliza A, Matrán C, Mayo-Iscar A (2015) Avoiding spurious local maximizers in mixture modeling. Stat Comput 25(3):619–633
Gilmore RO, Thomas H (2002) Examining individual differences in infants’ habituation patterns using objective quantitative techniques. Infant Behav Dev 25(3):399–412
Grün B, Leisch F (2008) FlexMix version 2: finite mixtures with concomitant variables and varying and constant parameters. J Stat Softw 28(4):1–35
Hood BM, Murray L, King F, Hooper R, Atkinson J, Braddick O (1996) Habituation changes in early infancy: longitudinal measures from birth to 6 months. J Reprod Infant Psychol 14(3):177–185
Huang W-J, Chang S-H (2007) On some characterizations of the mixture of gamma distributions. J Stat Plan Inference 137(9):2964–2974
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Ingrassia S (2004) A likelihood-based constrained algorithm for multivariate normal mixture models. Stat Methods Appl 13(2):151–166
John S (1970) On identifying the population of origin of each observation in a mixture of observations from two gamma populations. Technometrics 12(3):565–568
Karlis D, Xekalaki E (1998) Minimum Hellinger distance estimation for Poisson mixtures. Comput Stat Data Anal 29(1):81–103
Karlis D, Xekalaki E (2003) Choosing initial values for the EM algorithm for finite mixtures. Comput Stat Data Anal 41(3–4):577–590
Kim D, Seo B (2014) Assessment of the number of components in Gaussian mixture models in the presence of multiple local maximizers. J Multivar Anal 125:100–120
Kotz S, Balakrishnan N, Johnson NL (2000) Continuous multivariate distributions, volume 1: models and applications, 2nd edn. Wiley, New York
Krishnamoorthy K, Mathew T, Mukherjee S (2008) Normal-based methods for a gamma distribution: prediction and tolerance intervals and stress-strength reliability. Technometrics 50(1):69–78
Krishnamoorthy K, Lee M, Xiao W (2015) Likelihood ratio tests for comparing several gamma distributions. Environmetrics 26(8):571–583
Lee SX, McLachlan GJ (2013) Model-based clustering and classification with non-normal mixture distributions. Stat Methods Appl 22(4):427–454
Li Z, Baniaga AE, Sessa EB, Scascitelli M, Graham SW, Rieseberg LH, Barker MS (2015) Early genome duplications in conifers and other seed plants. Sci Adv 1(10):1–8
Li H-C, Krylov VA, Fan P-Z, Zerubia J, Emery WJ (2016) Unsupervised learning of generalized gamma mixture model with application in statistical modeling of high-resolution SAR images. IEEE Trans Geosci Remote Sens 54(4):2153–2170
Lindsay BG (1994) Efficiency versus robustness: the case for minimum Hellinger distance estimation and related methods. Ann Stat 22(2):1081–1114
Lindsay BG (1995) Mixture models: theory, geometry and applications, volume 5 of NSF-CBMS regional conference series in probability and statistics. Institute of Mathematical Statistics and the American Statistical Association
Manly BFJ (1976) Exponential data transformations. J R Stat Soc Ser D (Stat) 25(1):37–42
Mathai AM, Moschopoulos PG (1992) A form of multivariate gamma distribution. Ann Inst Stat Math 44(1):97–106
Mayrose I, Friedman N, Pupko T (2005) A gamma mixture model better accounts for among site rate heterogeneity. Bioinformatics 21(2):151–158
McLachlan GJ (1987) On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Appl Stat 36(3):318–324
McLachlan GJ (1988) On the choice of starting values for the EM algorithm in fitting finite mixture models. J R Stat Soc Ser D 37(4/5):417–425
McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York
McNicholas PD (2016) Mixture model-based classification. CRC Press, Boca Raton
Meng X-L, Rubin DB (1993) Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika 80(2):267–278
Nei M, Gojobori T (1986) Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3(5):418–426
Nemec J, Linnell-Nemec AF (1991) Mixture models for studying stellar populations I. Univariate mixture models, parameter estimation, and the number of discrete population components. Publ Astron Soc Pac 103(659):95–121
Nielsen F (2012) K-MLE: a fast algorithm for learning statistical mixture models. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 869–872
Nwe TL, Nguyen TH, Ma B (2014) On the use of Bhattacharyya based GMM distance and neural net features for identification of cognitive load levels. In: INTERSPEECH 2014, 15th annual conference of the international speech communication association, pp 736–740
Pagel M, Meade A (2004) A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Syst Biol 53(4):571–581
Panchy N, Lehti-Shiu M, Shiu S-H (2016) Evolution of gene duplication in plants. Plant Physiol 171(4):2294–2316
R Core Team (2016) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Ruppert D (2001) Multivariate transformations. In: El-Shaarawi AH, Piegorsch WW (eds) Encyclopedia of environmetrics. Wiley, New York
Schwander O, Nielsen F (2013) Fast learning of gamma mixture models with k-MLE. In: Hancock E, Pelillo M (eds) Similarity-based pattern recognition, vol 7953. Springer, Berlin, pp 235–249
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Scrucca L, Fop M, Murphy TB, Raftery AE (2016) mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R J 8(1):289–317
Sfikas G, Constantinopoulos C, Likas A, Galatsanos NP (2005) An analytic distance metric for Gaussian mixture models with application in image retrieval. In: Duch W, Kacprzyk J, Oja E, Zadrożny S (eds) Artificial neural networks: formal models and their applications – ICANN 2005, vol 3697. Springer, Berlin, pp 835–840
Slater A (1997) Can measures of infant habituation predict later intellectual ability? Arch Dis Child 77(6):474–476
Song PX-K (2000) Multivariate dispersion models generated from Gaussian copulas. Scand J Stat 27(2):305–320
Thomas H, Faßbender I (2017) Modeling infant \(i\)’s look on trial \(t\): race-face preference depends on \(i\)’s looking style. Front Psychol 8(1016):1–11
Thomas H, Hettmansperger TP (2001) Modelling change in cognitive understanding with finite mixtures. J R Stat Soc Ser C 50(4):435–448
Thomas H, Lohaus A, Domsch H (2011) Extensions of reliability theory. In: Hunter DR, Richards DSP, Rosenberger JL (eds) Nonparametric statistics and mixture models: a Festschrift in honor of Thomas P. Hettmansperger. World Scientific, Singapore, pp 309–316
Titterington DM, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distributions. Wiley, New York
Todd RT, Forche A, Selmecki A (2017) Ploidy variation in fungi: polyploidy, aneuploidy, and genome evolution. Microbiol Spectr 5(4):1–31
Vaidyanathan VS, Vani Lakshmi R (2016) Estimation of parameters in a finite mixture of multivariate gamma distributions using Gaussian approximation. Sri Lankan J Appl Stat 17(3):187–200
Van de Peer Y, Fawcett JA, Proost S, Sterck L, Vandepoele K (2009) The flowering world: a tale of duplications. Trends Plant Sci 14(12):680–688
Van de Peer Y, Mizrachi E, Marchal K (2017) The evolutionary significance of polyploidy. Nat Rev Genet 18(7):411–424
Vani Lakshmi R, Vaidyanathan VS (2016) Parameter estimation in gamma mixture model using normal-based approximation. J Stat Theory Appl 15(1):25–35
Vanneste K, Van de Peer Y, Maere S (2013) Inference of genome duplications from age distributions revisited. Mol Biol Evol 30(1):177–190
Vardi Y, Shepp LA, Kaufman L (1985) A statistical model for positron emission tomography. J Am Stat Assoc 80(389):8–20
Venturini S, Dominici F, Parmigiani G (2008) Gamma shape mixtures for heavy-tailed distributions. Ann Appl Stat 2(2):756–776
Walker JF, Yang Y, Feng T, Timoneda A, Mikenas J, Hutchison V, Edwards C, Wang N, Ahluwalia S, Olivieri J, Walker-Hale N, Majure LC, Puente R, Kadereit G, Lauterbach M, Eggli U, Flores-Olvera H, Ochoterena H, Brockington SF, Moore MJ, Smith SA (2018) From cacti to carnivores: improved phylotranscriptomic sampling and hierarchical homology inference provide further insight into the evolution of Caryophyllales. Am J Bot 105(3):446–462
Wilson EB, Hilferty MM (1931) The distribution of chi-square. Proc Natl Acad Sci 17(12):684–688
Wiper M, Insua DR, Ruggeri F (2001) Mixtures of gamma distributions with applications. J Comput Graph Stat 10(3):440–454
Woodward WA, Parr WC, Schucany WR, Lindsey H (1984) A comparison of minimum distance and maximum likelihood estimation of a mixture proportion. J Am Stat Assoc 79(387):590–598
Xu L, Jordan M (1996) On convergence properties of the EM algorithm for Gaussian mixtures. Neural Comput 8(1):129–151
Yang Y, Moore MJ, Brockington SF, Mikenas J, Olivieri J, Walker JF, Smith SA (2018) Improved transcriptome sampling pinpoints 26 ancient and more recent polyploidy events in Caryophyllales, including two allopolyploidy events. New Phytol 217(2):855–870
Young DS, Hunter DR (2015) Random effects regression mixtures for analyzing infant habituation. J Appl Stat 42(7):1421–1441
Young DS, Ke C, Zeng X (2018) The mixturegram: a visualization tool for assessing the number of components in finite mixture models. J Comput Graph Stat 27(3):564–575
Acknowledgements
The authors thank the two anonymous referees, whose numerous helpful comments and suggestions improved this paper. We especially thank the reviewer who pointed out the paper of Vaidyanathan and Vani Lakshmi (2016), which led to a fuller discussion of the novel components introduced in our paper. We would also like to thank Hoben Thomas from the Department of Psychology, Pennsylvania State University, Arnold Lohaus from the Department of Psychology, University of Marburg, and the German Research Foundation (DFG) for providing the infant dataset. We are additionally grateful to Hoben Thomas for insights about the infant habituation study. The work of R. Nilo-Poyanco is supported by Fondecyt Iniciación 11150107.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Young, D.S., Chen, X., Hewage, D.C. et al. Finite mixture-of-gamma distributions: estimation, inference, and model-based clustering. Adv Data Anal Classif 13, 1053–1082 (2019). https://doi.org/10.1007/s11634-019-00361-y
DOI: https://doi.org/10.1007/s11634-019-00361-y
Keywords
- ECM algorithms
- Finite mixture models
- Identifiability
- Mixturegram
- Multivariate Gaussian copula
- Starting values