Abstract
Censored data arise frequently in diverse applications in which the observations to be measured are subject to upper or lower detection limits imposed by the experimental apparatus, so that they cannot be quantified exactly. Mixtures of factor analyzers with censored data (MFAC) have recently been proposed for model-based density estimation and clustering of high-dimensional data in the presence of censored observations. In this paper, we extend the MFAC model by incorporating regression equations that describe the relationship between covariates and multiply censored dependent variables. Two analytically feasible EM-type algorithms are developed for computing maximum likelihood estimates of the model parameters in closed form. Moreover, we provide an information-based method for computing asymptotic standard errors of the mixing proportions and regression coefficients. The utility and performance of the proposed methodology are illustrated through a simulation study and two real-data examples.
Acknowledgements
We are grateful to the Editor, Associate Editor, and two referees for their valuable comments and suggestions on an earlier version of this paper. W.L. Wang and T.I. Lin would like to acknowledge the support of the Ministry of Science and Technology of Taiwan under Grant Nos. MOST 107-2628-M-035-001-MY3 and MOST 107-2118-M-005-002-MY2, respectively. L.M. Castro acknowledges support from Grant FONDECYT 1170258 and the Millennium Science Initiative of the Ministry of Economy, Development and Tourism, Grant "Millennium Nucleus Center for the Discovery of Structures in Complex Data" from the Chilean government.
Appendices
A. Proof of Proposition 1
To evaluate the conditional expectation of the complete-data log-likelihood function given the observed data \((\varvec{v}_j,\varvec{c}_j)\) and the current estimates \(\hat{{\varvec{\varTheta }}}^{(k)}\) in the EM algorithm, we need the following conditional moments involving the latent variables \(\varvec{y}_j^{\mathrm{c}}\) and \(\varvec{u}_{ij}\):
(a)
Making use of the fact that \(\varvec{y}_j=\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}}+\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}}\), we have
$$\begin{aligned} E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)= & {} E\{(\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}}+\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}})\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\= & {} \varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}} + \varvec{C}_j^\top E(\varvec{y}_j^{\mathrm{c}} \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1) \\= & {} \varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}} +\varvec{C}_j^\top E(\varvec{W}_{ij}^{\mathrm{c}}), \end{aligned}$$where \(\varvec{W}_{ij}^{\mathrm{c}}\) is a random vector following the distribution in (4).
(b)
Similarly,
$$\begin{aligned}&E(\varvec{y}_j \varvec{y}_j^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)\\&\quad = E\{(\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}}+\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}})(\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}}+\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}})^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad = E\{(\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}} \varvec{v}_j^{\mathrm{o^\top }} \varvec{O}_j+\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}} \varvec{y}_j^{\mathrm{c^\top }} \varvec{C}_j+\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}} \varvec{v}_j^{\mathrm{o^\top }} \varvec{O}_j\\&\qquad +\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}} \varvec{y}_j^{\mathrm{c^\top }} \varvec{C}_j) \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\} \\&\quad = \varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}} \varvec{v}_j^{\mathrm{o^{\top }}} \varvec{O}_j +\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}} E^\top (\varvec{W}_{ij}^{\mathrm{c}}) \varvec{C}_j +\varvec{C}_j^\top E(\varvec{W}_{ij}^{\mathrm{c}}) \varvec{v}_j^{\mathrm{o^{\top }}} \varvec{O}_j\\&\qquad +\varvec{C}_j^\top E(\varvec{W}_{ij}^{\mathrm{c}} \varvec{W}_{ij}^{\mathrm{c^{\top }}}) \varvec{C}_j. \end{aligned}$$
(c)
Using the law of iterated expectations, we calculate
$$\begin{aligned} E(\varvec{u}_{ij} \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1 )= & {} E\{E(\varvec{u}_{ij} \mid \varvec{y}_j,\varvec{v}_j,\varvec{c}_j)\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\= & {} E[\varvec{B}_i^\top {\varvec{\varSigma }}_i^{-1} (\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1]\\= & {} {\varvec{\varGamma }}_i^\top \{E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)-\varvec{X}_j {\varvec{\beta }}_i\}, \end{aligned}$$where \({\varvec{\varGamma }}_i={\varvec{\varSigma }}_i^{-1}\varvec{B}_i\).
(d)
Likewise,
$$\begin{aligned}&E(\varvec{y}_j \varvec{u}_{ij}^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1 )\\&\quad = E\{E(\varvec{y}_j \varvec{u}_{ij}^\top \mid \varvec{y}_j,\varvec{v}_j,\varvec{c}_j,z_{ij}=1)\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad = E\{\varvec{y}_j(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)^\top {\varvec{\varSigma }}_i^{-1}\varvec{B}_i \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad = E\{(\varvec{y}_j \varvec{y}_j^\top -\varvec{y}_j {\varvec{\beta }}_i^\top \varvec{X}_j^\top ) \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}{\varvec{\varSigma }}_i^{-1}\varvec{B}_i\\&\quad = \{E(\varvec{y}_j \varvec{y}_j^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)-E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1){\varvec{\beta }}_i^\top \varvec{X}_j^\top \}{\varvec{\varGamma }}_i. \end{aligned}$$
(e)
A standard calculation gives
$$\begin{aligned}&E(\varvec{u}_{ij} \varvec{u}_{ij}^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)\\&\quad = E\{E(\varvec{u}_{ij} \varvec{u}_{ij}^\top \mid \varvec{y}_j,\varvec{v}_j,\varvec{c}_j,z_{ij}=1)\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad = E\{E(\varvec{u}_{ij} \mid \varvec{y}_j,\varvec{v}_j,\varvec{c}_j,z_{ij}=1)E(\varvec{u}_{ij}^\top \mid \varvec{y}_j,\varvec{v}_j,\varvec{c}_j,z_{ij}=1)\\&\qquad + \mathrm{cov}(\varvec{u}_{ij}\mid \varvec{y}_j,\varvec{v}_j,\varvec{c}_j,z_{ij}=1)\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad = E\{\varvec{B}_i^\top {\varvec{\varSigma }}_i^{-1}(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)^\top {\varvec{\varSigma }}_i^{-1} \varvec{B}_i\\&\qquad + (\varvec{I}_q-\varvec{B}_i^\top {\varvec{\varSigma }}_i^{-1}\varvec{B}_i) \mid \varvec{v}_j,\varvec{c}_j ,z_{ij}=1\}\\&\quad = {\varvec{\varGamma }}_i^\top \{E(\varvec{y}_j \varvec{y}_j^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)-E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1){\varvec{\beta }}_i^\top \varvec{X}_j^\top \\&\qquad -\varvec{X}_j {\varvec{\beta }}_i E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)^\top +\varvec{X}_j {\varvec{\beta }}_i{\varvec{\beta }}_i^\top \varvec{X}_j^\top \}{\varvec{\varGamma }}_i+ {\varvec{\varOmega }}_i, \end{aligned}$$where \({\varvec{\varOmega }}_i=\varvec{I}_q-\varvec{B}_i^\top {\varvec{\varSigma }}_i^{-1} \varvec{B}_i\).
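For concreteness, the five moments (a)–(e) above can be assembled numerically once the first two moments of the truncated normal vector \(\varvec{W}_{ij}^{\mathrm{c}}\) are available from a separate truncated-normal moment routine. The following Python sketch mirrors the formulas term by term; the function and argument names (`estep_moments`, `EW`, `EWW`, `Xb`) are ours, purely illustrative, and not from the paper.

```python
import numpy as np

def estep_moments(O, C, v_o, EW, EWW, Xb, Sigma, B):
    """Conditional moments (a)-(e) of Proposition 1.

    O, C   : selection matrices with y_j = O^T v_j^o + C^T y_j^c
    v_o    : observed sub-vector v_j^o
    EW, EWW: E(W_ij^c) and E(W_ij^c W_ij^c^T), assumed computed elsewhere
    Xb     : the mean vector X_j beta_i
    Sigma, B: component scale matrix Sigma_i and factor loadings B_i
    """
    Gamma = np.linalg.solve(Sigma, B)            # Gamma_i = Sigma_i^{-1} B_i
    Omega = np.eye(B.shape[1]) - B.T @ Gamma     # Omega_i = I_q - B^T Sigma^{-1} B
    Ey = O.T @ v_o + C.T @ EW                    # (a)
    Eyy = (O.T @ np.outer(v_o, v_o) @ O          # (b)
           + O.T @ np.outer(v_o, EW) @ C
           + C.T @ np.outer(EW, v_o) @ O
           + C.T @ EWW @ C)
    Eu = Gamma.T @ (Ey - Xb)                     # (c)
    Eyu = (Eyy - np.outer(Ey, Xb)) @ Gamma       # (d)
    M = Eyy - np.outer(Ey, Xb) - np.outer(Xb, Ey) + np.outer(Xb, Xb)
    Euu = Gamma.T @ M @ Gamma + Omega            # (e)
    return Ey, Eyy, Eu, Eyu, Euu
```

A quick sanity check: when the censored components are known exactly (so that \(E(\varvec{W}_{ij}^{\mathrm{c}} \varvec{W}_{ij}^{\mathrm{c^{\top }}})\) degenerates to the outer product of the mean), the formulas collapse to the fully observed case.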
B. Proof of Proposition 2
We derive the conditional expectations stated in Proposition 2, which are required in the AECM algorithm:
(a)
Making use of the fact that \(\varvec{y}_j=\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}}+\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}}\), we have
$$\begin{aligned} \mathrm{cov}(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)= & {} \mathrm{cov}[(\varvec{O}_j^\top \varvec{v}_j^{{\mathrm{o}}}+\varvec{C}_j^\top \varvec{y}_j^{\mathrm{c}}) \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1]\\= & {} \varvec{C}_j^\top \mathrm{cov}(\varvec{W}_{ij}^{\mathrm{c}})\varvec{C}_j \triangleq {\varvec{\varLambda }}_{ij}. \end{aligned}$$
(b)
A standard calculation gives
$$\begin{aligned}&E\{(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\} \\&\quad =E\{(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}E\{(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}^\top \\&\qquad +\mathrm{cov}(\varvec{y}_j\mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)\\&\quad = \{E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)-\varvec{X}_j {\varvec{\beta }}_i\}\{E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)-\varvec{X}_j {\varvec{\beta }}_i\}^\top +{\varvec{\varLambda }}_{ij}. \end{aligned}$$
(c)
By the law of iterated expectations, we calculate
$$\begin{aligned}&E\{(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)\varvec{u}_{ij}^\top \varvec{B}_i^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad =E\{(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)E(\varvec{u}_{ij}^\top \mid \varvec{y}_j)\varvec{B}_i^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad =E\{(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}{\varvec{\varGamma }}_i \varvec{B}_i^\top . \end{aligned}$$
(d)
A standard calculation gives
$$\begin{aligned}&E(\varvec{B}_i \varvec{u}_{ij} \varvec{u}_{ij}^\top \varvec{B}_i^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1) \\&\quad =E\{\varvec{B}_i E(\varvec{u}_{ij} \varvec{u}_{ij}^\top \mid \varvec{y}_j )\varvec{B}_i^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1\}\\&\quad =E[\varvec{B}_i \{E(\varvec{u}_{ij} \mid \varvec{y}_j)E(\varvec{u}_{ij}^\top \mid \varvec{y}_j )+\mathrm{cov}(\varvec{u}_{ij}\mid \varvec{y}_j)\}\varvec{B}_i^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1] \\&\quad =E[\varvec{B}_i\{{\varvec{\varGamma }}_i^\top (\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)(\varvec{y}_j-\varvec{X}_j {\varvec{\beta }}_i)^\top {\varvec{\varGamma }}_i+{\varvec{\varOmega }}_i\}\varvec{B}_i^\top \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1]\\&\quad =\varvec{B}_i( {\varvec{\varGamma }}_i^\top [\{E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)-\varvec{X}_j {\varvec{\beta }}_i\}\\&\qquad \times \{E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)-\varvec{X}_j {\varvec{\beta }}_i\}^\top +{\varvec{\varLambda }}_{ij}]{\varvec{\varGamma }}_i+{\varvec{\varOmega }}_i ) \varvec{B}_i^\top . \end{aligned}$$
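As with Proposition 1, the quantities (a)–(d) above can be evaluated numerically once \(\mathrm{cov}(\varvec{W}_{ij}^{\mathrm{c}})\) and \(E(\varvec{y}_j \mid \varvec{v}_j,\varvec{c}_j,z_{ij}=1)\) are in hand. A minimal Python sketch follows; the names `aecm_moments`, `covW`, `Ey`, and `Xb` are ours and purely illustrative.

```python
import numpy as np

def aecm_moments(C, covW, Ey, Xb, Sigma, B):
    """Quantities (a)-(d) of Proposition 2.

    C    : selection matrix for the censored coordinates
    covW : cov(W_ij^c), assumed computed by a truncated-normal routine
    Ey   : E(y_j | v_j, c_j, z_ij = 1), from Proposition 1
    Xb   : the mean vector X_j beta_i
    """
    Gamma = np.linalg.solve(Sigma, B)            # Gamma_i = Sigma_i^{-1} B_i
    Omega = np.eye(B.shape[1]) - B.T @ Gamma     # Omega_i = I_q - B^T Sigma^{-1} B
    Lam = C.T @ covW @ C                         # (a) cov(y_j | ...) = Lambda_ij
    r = Ey - Xb                                  # conditional mean residual
    S = np.outer(r, r) + Lam                     # (b) E{(y - Xb)(y - Xb)^T | ...}
    SGB = S @ Gamma @ B.T                        # (c) E{(y - Xb) u^T B^T | ...}
    BuuB = B @ (Gamma.T @ S @ Gamma + Omega) @ B.T   # (d) E(B u u^T B^T | ...)
    return Lam, S, SGB, BuuB
```

When \(\mathrm{cov}(\varvec{W}_{ij}^{\mathrm{c}})\) is zero, \({\varvec{\varLambda }}_{ij}\) vanishes and all four quantities reduce to their fully observed counterparts, which provides a convenient check on an implementation.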
Cite this article
Wang, WL., Castro, L.M., Hsieh, WC. et al. Mixtures of factor analyzers with covariates for modeling multiply censored dependent variables. Stat Papers 62, 2119–2145 (2021). https://doi.org/10.1007/s00362-020-01177-1