Abstract
This paper proposes a zero-and-one-inflated Poisson (ZOIP) regression model and investigates maximum likelihood estimation (MLE) and Bayesian estimation for it. Based on data augmentation, three estimation methods are developed: the expectation-maximization (EM) algorithm, the generalized expectation-maximization (GEM) algorithm, and Gibbs sampling. A simulation study assesses the performance of the proposed estimators for various sample sizes. Finally, an accidental deaths data set is analyzed to illustrate the practicability of the proposed methods.
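For orientation, the sketch below evaluates a ZOIP probability mass function without covariates. It assumes the common mixture form in which a point mass at zero (weight `phi0`), a point mass at one (weight `phi1`) and a Poisson(`lam`) component (weight `1 - phi0 - phi1`) are combined; the function name and this parameterization are illustrative assumptions and may differ from the notation used in the paper.

```python
import math

def zoip_pmf(y, phi0, phi1, lam):
    """P(Y = y) under an assumed three-component ZOIP mixture.

    phi0, phi1: extra mass placed at 0 and at 1 (hypothetical names).
    lam: mean of the Poisson component.
    """
    phi2 = 1.0 - phi0 - phi1                      # weight of the Poisson part
    pois = math.exp(-lam) * lam ** y / math.factorial(y)
    return phi0 * (y == 0) + phi1 * (y == 1) + phi2 * pois

# Probabilities of the first few counts for one illustrative parameter choice.
probs = [zoip_pmf(k, 0.2, 0.1, 2.0) for k in range(60)]
```

Compared with a plain Poisson(2.0) distribution, the values at 0 and 1 are inflated while the remaining probabilities are scaled down by the Poisson weight.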
References
Agarwal DK, Gelfand AE, Citron-Pousty S (2002) Zero-inflated models with application to spatial count data. Environ Ecol Stat 9:341–355
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723
Chen XD (2009) Bayesian analysis of semiparametric mixed-effects models for zero-inflated count data. Commun Stat Theory Methods 38:1815–1833
Dagne GA (2010) Bayesian semiparametric zero-inflated Poisson model for longitudinal count data. Math Biosci 224:126–130
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion). J R Stat Soc Ser B 39:1–38
Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7:1–26
Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85:398–409
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. CRC Press, Boca Raton
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741
Casella G, George EI (1992) Explaining the Gibbs sampler. Am Stat 46:167–174
Ghosh SK, Mukhopadhyay P, Lu JC (2006) Bayesian analysis of zero-inflated regression models. J Stat Plann Inference 136:1360–1375
Gilks WR, Wild P (1992) Adaptive rejection sampling for Gibbs sampling. Appl Stat 41:337–348
Hunter DR, Lange K (2000) Quantile regression via an MM Algorithm. J Comput Gr Stat 9:60–77
Jansakul N, Hinde JP (2002) Score tests for zero-inflated Poisson models. Comput Stat Data Anal 40:75–96
Jung BC, Jhun M, Lee JW (2005) Bootstrap tests for overdispersion in a zero-inflated Poisson regression model. Biometrics 61:626–629
Khoshgoftaar TM, Gao K, Szabo RM (2005) Comparing software fault predictions of pure and zero-inflated Poisson regression models. Int J Syst Sci 36:705–715
Lambert D (1992) Zero-inflated Poisson regression with application to defects in manufacturing. Technometrics 34:1–14
Lange K, Hunter DR, Yang I (2000) Optimization transfer using surrogate objective functions. J Comput Gr Stat 9:1–59
Liu WC, Tang YC, Xu AC (2018) A zero-and-one inflated Poisson model and its application. Stat Its Interface 11:339–351
Lim HK, Li WK, Yu PL (2014) Zero-inflated Poisson regression mixture model. Comput Stat Data Anal 71:151–158
Long DL, Preisser JS, Herring AH, Golin CE (2014) A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat Med 33:5151–5165
Louis TA (1982) Finding the observed information matrix when using the EM algorithm. J R Stat Soc Ser B 44:226–233
Mullahy J (1986) Specification and testing of some modified count data models. J Econom 33:341–365
Musio M, Sauleau EA, Buemi A (2010) Bayesian semiparametric ZIP models with space-time interactions: an application to cancer registry data. Math Med Biol 27:181–194
Neelon B, Chung DJ (2017) The LZIP: a Bayesian latent factor model for correlated zero-inflated counts. Biometrics 73:185–196
Ridout M, Hinde J, Demétrio CGB (2001) A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives. Biometrics 57:219–223
Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A (2002) Bayesian measures of model complexity and fit (with discussion). J R Stat Soc Ser B 64:583–639
Tang YC, Liu WC, Xu AC (2017) Statistical inference for zero-and-one-inflated Poisson models. Stat Theory Relat Fields 1:216–226
Tierney L (1994) Markov chains for exploring posterior distributions (with discussion). Ann Stat 22:1701–1762
van den Broek J (1995) A score test for zero inflation in a Poisson distribution. Biometrics 51:738–743
Wit E, van den Heuvel E, Romeijn J-W (2012) 'All models are wrong...': an introduction to model uncertainty. Stat Neerl 66:217–236
Wu CFJ (1983) On the convergence properties of the EM Algorithm. Ann Stat 11:95–103
Xie FC, Lin JG, Wei BC (2014) Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics. J Appl Stat 41:1383–1392
Xu HY, Xie M, Goh TN (2014) Objective Bayes analysis of zero-inflated Poisson distribution with application to healthcare data. IIE Trans 46:843–852
Zhang C, Tian G-L, Ng K-W (2016) Properties of the zero-and-one inflated Poisson distribution and likelihood-based inference methods. Stat Its Interface 9:11–32
Zhu H, Luo S, DeSantis SM (2017) Zero-inflated count models for longitudinal measurements with heterogeneous random effects. Stat Methods Med Res 26:1774–1786
Acknowledgements
This research was supported by the Natural Science Foundation of China (Nos. 11271136, 81530086, 11671303, 11201345), the 111 Project of China (No. B14019), the Natural Science Foundation of Zhejiang Province (No. LY15G010006) and the China Postdoctoral Science Foundation (No. 2015M572598).
Appendix
1.1 Appendix A.1
Proof of Theorem 3.1
The proof depends on the smoothness of \(Q(\varvec{\eta },\varvec{\eta }^{(k)}) = E[\ell _{c}(\varvec{\eta }|{\varvec{Y}}, {\varvec{B}}_{1},{\varvec{B}}_{2})|{\varvec{Y}}, \varvec{\eta }^{(k)}]\). We rewrite
where
and
Simple calculation yields
which is continuous in \(\varvec{\eta }\) and \(\varvec{\eta }^{(k)}\). The same argument applies to \(Q_{2}(\varvec{\eta },\varvec{\eta }^{(k)})\) and \(Q_{3}(\varvec{\eta },\varvec{\eta }^{(k)})\). Hence condition (10) in Wu (1983) holds for \(Q(\varvec{\eta },\varvec{\eta }^{(k)})\). By the construction of the GEM algorithm, it is easy to see that
and
respectively. Together with Eqs. (13), (14) and (15), this gives \(||\varvec{\eta }^{(k+1)}-\varvec{\eta }^{(k)}||\rightarrow 0\) as \(k \rightarrow \infty \). The conclusion then follows from Theorems 2 and 5 of Wu (1983). \(\square \)
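The two properties used in this argument, a non-decreasing observed-data log-likelihood and successive iterates whose distance shrinks to zero, can be observed numerically. The sketch below is not the regression GEM algorithm of the paper: it assumes an intercept-only ZOIP mixture with point-mass weights `phi0` and `phi1` and Poisson mean `lam` (hypothetical names), uses a plain EM step with a closed-form M-step, and runs on made-up counts.

```python
import math

def zoip_loglik(y, phi0, phi1, lam):
    """Observed-data log-likelihood of the assumed intercept-only ZOIP model."""
    phi2 = 1.0 - phi0 - phi1
    total = 0.0
    for v in y:
        pois = math.exp(-lam + v * math.log(lam) - math.lgamma(v + 1))
        total += math.log(phi0 * (v == 0) + phi1 * (v == 1) + phi2 * pois)
    return total

def em_step(y, phi0, phi1, lam):
    """One EM update: posterior component weights, then closed-form maximizers."""
    phi2 = 1.0 - phi0 - phi1
    s0 = s1 = 0.0        # expected numbers of structural zeros and ones
    wsum = wy = 0.0      # Poisson-component weights and weighted sum of counts
    for v in y:
        if v == 0:
            w0 = phi0 / (phi0 + phi2 * math.exp(-lam))
            s0 += w0
            w = 1.0 - w0
        elif v == 1:
            w1 = phi1 / (phi1 + phi2 * lam * math.exp(-lam))
            s1 += w1
            w = 1.0 - w1
        else:
            w = 1.0      # counts of 2 or more can only come from the Poisson part
        wsum += w
        wy += w * v
    return s0 / len(y), s1 / len(y), wy / wsum

def run_em(y, theta, n_iter=500):
    """Iterate EM, recording the log-likelihood and the step length per iteration."""
    logliks = [zoip_loglik(y, *theta)]
    steps = []
    for _ in range(n_iter):
        new = em_step(y, *theta)
        steps.append(math.dist(new, theta))
        theta = new
        logliks.append(zoip_loglik(y, *theta))
    return theta, logliks, steps

# Made-up counts with excess zeros and ones (illustrative data only).
y = [0] * 40 + [1] * 30 + [2] * 12 + [3] * 9 + [4] * 5 + [5] * 3 + [6] * 1
theta, logliks, steps = run_em(y, (0.2, 0.2, 1.0))
```

Inspecting `logliks` and `steps` after the run shows the monotone ascent and the vanishing step lengths that the conditions of Wu (1983) formalize; a GEM variant would replace the exact M-step by any update that increases \(Q\).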
1.2 Appendix A.2
Full conditional distributions of \({\varvec{\beta }}, {\varvec{\gamma }}_1\) and \({\varvec{\gamma }}_2\) under normal priors. After the data augmentation step, the joint posterior of \({\varvec{\beta }}\), \({\varvec{\gamma }}_{1}\) and \({\varvec{\gamma }}_{2}\) is as follows:
where \(\Pi ({\varvec{\beta }},{\varvec{\gamma }}_{1},{\varvec{\gamma }}_{2})\) is the prior of the parameters \({\varvec{\beta }}, {\varvec{\gamma }}_{1}\) and \({\varvec{\gamma }}_{2}\). Let \(N({\varvec{\beta }}_{0},\sigma _{\beta }{\varvec{I}}_{q})\), \(N({\varvec{\gamma }}_{01},\sigma _{\gamma _{1}}{\varvec{I}}_{r1})\) and \(N({\varvec{\gamma }}_{02},\sigma _{\gamma _{2}}{\varvec{I}}_{r2})\) be the priors for \({\varvec{\beta }}, {\varvec{\gamma }}_{1}\) and \({\varvec{\gamma }}_{2}\), respectively, and assume that they are mutually independent. The full conditional distributions of \({\varvec{\beta }}\), \({\varvec{\gamma }}_{1}\) and \({\varvec{\gamma }}_{2}\) are not standard distributions. Their densities are as follows:
and
The second-order partial derivative of \(\log {\pi [{\varvec{\beta }}|\text {rest},{\varvec{Y}}]}\) is also easy to obtain:
It is negative definite, so \({\pi [{\varvec{\beta }}|\text {rest},{\varvec{Y}}]}\) is log-concave. The log-concavity of the conditional densities \({\pi [{\varvec{\gamma }}_{1}|\text {rest},{\varvec{Y}}]}\) and \({\pi [{\varvec{\gamma }}_{2}|\text {rest},{\varvec{Y}}]}\) can be proved similarly. Hence adaptive rejection sampling (ARS) can be used to sample \({\varvec{\beta }}\), \({\varvec{\gamma }}_{1}\) and \({\varvec{\gamma }}_{2}\) from their respective full conditional distributions.
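The log-concavity claim can be checked numerically in a simplified setting. The sketch below assumes a scalar \(\beta\), a log link \(\lambda_i = \exp(x_i \beta)\) for the Poisson component, a latent indicator `z` marking which observations are assigned to that component, and a \(N(\beta_0, \sigma^2)\) prior; these names, the link and the data are illustrative assumptions rather than the paper's exact specification, and ARS itself is not implemented here.

```python
import math

def log_full_conditional(beta, x, y, z, beta0, sigma2):
    """Unnormalised log full conditional of a scalar beta (assumed log link)."""
    loglik = sum(zi * (yi * xi * beta - math.exp(xi * beta))
                 for xi, yi, zi in zip(x, y, z))
    return loglik - (beta - beta0) ** 2 / (2.0 * sigma2)

def second_derivative(beta, x, z, sigma2):
    """Analytic second derivative of the log full conditional with respect to beta."""
    return -sum(zi * xi * xi * math.exp(xi * beta)
                for xi, zi in zip(x, z)) - 1.0 / sigma2

# Made-up covariates, counts and component indicators (illustrative only).
x = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.3]
y = [0, 1, 2, 3, 5, 2]
z = [0, 1, 1, 1, 1, 1]
beta0, sigma2 = 0.0, 10.0

grid = [-3.0 + 0.25 * k for k in range(25)]        # beta values from -3 to 3
analytic = [second_derivative(b, x, z, sigma2) for b in grid]

# Central finite differences as an independent check of the analytic formula.
h = 1e-3
numeric = [(log_full_conditional(b + h, x, y, z, beta0, sigma2)
            - 2.0 * log_full_conditional(b, x, y, z, beta0, sigma2)
            + log_full_conditional(b - h, x, y, z, beta0, sigma2)) / h ** 2
           for b in grid]
```

Every entry of `analytic` is strictly negative because both the Poisson term and the normal prior term contribute a negative curvature, which is the property ARS requires of its target density.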
Cite this article
Liu, W., Tang, Y. & Xu, A. Zero-and-one-inflated Poisson regression model. Stat Papers 62, 915–934 (2021). https://doi.org/10.1007/s00362-019-01118-7