Zero-and-one-inflated Poisson regression model

Liu, Wenchen; Tang, Yincai; Xu, Ancha

doi:10.1007/s00362-019-01118-7

Regular Article
Published: 28 June 2019

Volume 62, pages 915–934, (2021)
Statistical Papers

Wenchen Liu¹,
Yincai Tang¹ &
Ancha Xu²

Abstract

In this paper, a zero-and-one-inflated Poisson (ZOIP) regression model is proposed. Maximum likelihood estimation (MLE) and Bayesian estimation for this model are investigated. Based on data augmentation, three estimation methods for the ZOIP regression model are developed: the expectation-maximization (EM) algorithm, the generalized expectation-maximization (GEM) algorithm, and Gibbs sampling. A simulation study is conducted to assess the performance of the proposed estimation methods for various sample sizes. Finally, an accidental deaths data set is analyzed to illustrate the practicability of the proposed methods.

References

Agarwal DK, Gelfand AE, Citron-Pousty S (2002) Zero-inflated models with application to spatial count data. Environ Ecol Stat 9:341–355
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723
Chen XD (2009) Bayesian analysis of semiparametric mixed-effects models for zero-inflated count data. Commun Stat Theory Methods 38:1815–1833
Dagne GA (2010) Bayesian semiparametric zero-inflated Poisson model for longitudinal count data. Math Biosci 224:126–130
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion). J R Stat Soc Ser B 39:1–38
Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7:1–26
Gelfand AE, Smith AFM (1990) Sampling based approaches to calculating marginal densities. J Am Stat Assoc 85:398–409
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. CRC Press, Boca Raton
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741
Casella G, George EI (1992) Explaining the Gibbs sampler. Am Stat 46:167–174
Ghosh SK, Mukhopadhyay P, Lu JC (2006) Bayesian analysis of zero-inflated regression models. J Stat Plann Inference 136:1360–1375
Gilks WR, Wild P (1992) Adaptive rejection sampling for Gibbs sampling. Appl Stat 41:337–348
Hunter DR, Lange K (2000) Quantile regression via an MM algorithm. J Comput Graph Stat 9:60–77
Jansakul N, Hinde JP (2002) Score tests for zero-inflated Poisson models. Comput Stat Data Anal 40:75–96
Jung BC, Jhun M, Lee JW (2005) Bootstrap tests for overdispersion in a zero-inflated Poisson regression model. Biometrics 61:626–629
Khoshgoftaar TM, Gao K, Szabo RM (2005) Comparing software fault predictions of pure and zero-inflated Poisson regression models. Int J Syst Sci 36:705–715
Lambert D (1992) Zero-inflated Poisson regression with application to defects in manufacturing. Technometrics 34:1–14
Lange K, Hunter DR, Yang I (2000) Optimization transfer using surrogate objective functions. J Comput Graph Stat 9:1–59
Liu WC, Tang YC, Xu AC (2018) A zero-and-one inflated Poisson model and its application. Stat Its Interface 11:339–351
Lim HK, Li WK, Yu PL (2014) Zero-inflated Poisson regression mixture model. Comput Stat Data Anal 71:151–158
Long DL, Preisser JS, Herring AH, Golin CE (2014) A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat Med 33:5151–5165
Louis TA (1982) Finding the observed information matrix when using the EM algorithm. J R Stat Soc Ser B 44:226–233
Mullahy J (1986) Specification and testing of some modified count data models. J Econom 33:341–365
Musio M, Sauleau EA, Buemi A (2010) Bayesian semiparametric ZIP models with space-time interactions: an application to cancer registry data. Math Med Biol 27:181–194
Neelon B, Chung DJ (2017) The LZIP: a Bayesian latent factor model for correlated zero-inflated counts. Biometrics 73:185–196
Ridout M, Hinde J, Demétrio CGB (2001) A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives. Biometrics 57:219–223
Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A (2002) Bayesian measures of model complexity and fit (with discussion). J R Stat Soc Ser B 64:583–639
Tang YC, Liu WC, Xu AC (2017) Statistical inference for zero-and-one-inflated Poisson models. Stat Theory Relat Fields 1:216–226
Tierney L (1994) Markov chains for exploring posterior distributions (with discussion). Ann Stat 22:1701–1762
van den Broek J (1995) A score test for zero inflation in a Poisson distribution. Biometrics 51:738–743
Wit E, van den Heuvel E, Romeijn J-W (2012) 'All models are wrong...': an introduction to model uncertainty. Stat Neerl 66:217–236
Wu CFJ (1983) On the convergence properties of the EM algorithm. Ann Stat 11:95–103
Xie FC, Lin JG, Wei BC (2014) Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics. J Appl Stat 41:1383–1392
Xu HY, Xie M, Goh TN (2014) Objective Bayes analysis of zero-inflated Poisson distribution with application to healthcare data. IIE Trans 46:843–852
Zhang C, Tian G-L, Ng K-W (2016) Properties of the zero-and-one inflated Poisson distribution and likelihood-based inference methods. Stat Its Interface 9:11–32
Zhu H, Luo S, DeSantis SM (2017) Zero-inflated count models for longitudinal measurements with heterogeneous random effects. Stat Methods Med Res 26:1774–1786


Acknowledgements

The research is supported by the Natural Science Foundation of China (Nos. 11271136, 81530086, 11671303, 11201345, 11671303), the 111 Project of China (No. B14019), the Natural Science Foundation of Zhejiang Province (No. LY15G010006) and the China Postdoctoral Science Foundation (No. 2015M572598).

Author information

Authors and Affiliations

KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, 200241, China
Wenchen Liu & Yincai Tang
Department of Mathematics, Wenzhou University, Zhejiang, 325035, China
Ancha Xu

Corresponding authors

Correspondence to Wenchen Liu or Yincai Tang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Appendix A.1

Proof of Theorem 3.1

The proof depends on the smoothness of $Q(\varvec{\eta },\varvec{\eta }^{(k)}) = E[\ell _{c}(\varvec{\eta }|{\varvec{Y}}, {\varvec{B}}_{1},{\varvec{B}}_{2})|{\varvec{Y}}, \varvec{\eta }^{(k)}]$. We rewrite

$$\begin{aligned} Q(\varvec{\eta },\varvec{\eta }^{(k)}) = Q_{1}(\varvec{\eta },\varvec{\eta }^{(k)}) + Q_{2}(\varvec{\eta },\varvec{\eta }^{(k)}) + Q_{3}(\varvec{\eta },\varvec{\eta }^{(k)}), \end{aligned}$$

where

$$\begin{aligned}&Q_{1}(\varvec{\eta },\varvec{\eta }^{(k)})=\sum _{i=1}^{n} E\bigg [(1-B_{1i})\big (Y_{i}{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}-\mathrm {e}^{{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}}\big )\bigg |{\varvec{Y}},\varvec{\eta }^{(k)}\bigg ],\\&Q_{2}(\varvec{\eta },\varvec{\eta }^{(k)})=\sum _{i=1}^{n} E\bigg [B_{1i}\varvec{\omega }_{1i}^{\mathrm {T}} {\varvec{\gamma }}_{1}-\ln \big (1+ \mathrm {e}^{\varvec{\omega }_{1i}^{\mathrm {T}} {\varvec{\gamma }}_{1}}\big )-(1-B_{1i})\ln (Y_{i}!)\bigg | {\varvec{Y}},\varvec{\eta }^{(k)}\bigg ], \end{aligned}$$

and

$$\begin{aligned} Q_{3}(\varvec{\eta },\varvec{\eta }^{(k)})=\sum _{i=1}^{n} E\bigg [B_{2i}\varvec{\omega }_{2i}^{\mathrm {T}} {\varvec{\gamma }}_{2}-\ln \big (1+\mathrm {e}^ {\varvec{\omega }_{2i}^{\mathrm {T}}{\varvec{\gamma }}_{2}}\big )\bigg | {\varvec{Y}},\varvec{\eta }^{(k)}\bigg ]. \end{aligned}$$

A direct calculation yields

$$\begin{aligned} \begin{aligned}&Q_{1}(\varvec{\eta },\varvec{\eta }^{(k)})\\&\quad = \sum _{i=1}^{n}I\{Y_{i}=0\}\left[ \frac{(1-p_{1i}^{(k)}) P(V_{i}^{(k)}=0)}{p_{1i}^{(k)}p_{2i}^{(k)}+(1-p_{1i}^{(k)}) P(V_{i}^{(k)}=0)}\right] \big (Y_{i}{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}-\mathrm {e}^{{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}}\big )\\&\qquad +\sum _{i=1}^{n}I\{Y_{i}=1\}\left[ \frac{(1-p_{1i}^{(k)}) P(V_{i}^{(k)}=1)}{p_{1i}^{(k)}(1-p_{2i}^{(k)})+(1-p_{1i}^{(k)}) P(V_{i}^{(k)}=1)}\right] \big (Y_{i}{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}-\mathrm {e}^{{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}}\big )\\&\qquad +\sum _{i=1}^{n}I\{Y_{i}\ge 2\} \big (Y_{i}{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}-\mathrm {e}^{{\varvec{z}}_{i}^{\mathrm {T}} {\varvec{\beta }}}\big ), \end{aligned} \end{aligned}$$

which is continuous in $\varvec{\eta }$ and $\varvec{\eta }^{(k)}$. The same holds for $Q_{2}(\varvec{\eta },\varvec{\eta }^{(k)})$ and $Q_{3}(\varvec{\eta },\varvec{\eta }^{(k)})$. It is therefore easy to verify that condition (10) in Wu (1983) holds for $Q(\varvec{\eta },\varvec{\eta }^{(k)})$. By the construction of the GEM algorithm, it is easy to see that

$$\begin{aligned}&\ell _{c}\big ({\varvec{\gamma }}_{1}^{(k+1)}\big |{\varvec{Y}}, {\varvec{B}}_{1}^{(k+1)},{\varvec{B}}_{2}^{(k+1)}\big )-\ell _{c} \big ({\varvec{\gamma }}_{1}^{(k)}\big |{\varvec{Y}},{\varvec{B}}_{1}^{(k+1)}, {\varvec{B}}_{2}^{(k+1)}\big )\\&\quad \ge \big \Vert {\varvec{\gamma }}_{1}^{(k+1)} -{\varvec{\gamma }}_{1}^{(k)}\big \Vert ^{2}/2, \end{aligned}$$

(13)

$$\begin{aligned}&\ell _{c}\big ({\varvec{\gamma }}_{2}^{(k+1)}\big |{\varvec{Y}}, {\varvec{B}}_{1}^{(k+1)},{\varvec{B}}_{2}^{(k+1)}\big )-\ell _{c} \big ({\varvec{\gamma }}_{2}^{(k)}\big |{\varvec{Y}},{\varvec{B}}_{1}^{(k+1)}, {\varvec{B}}_{2}^{(k+1)}\big )\\&\quad \ge \big \Vert {\varvec{\gamma }}_{2}^{(k+1)} -{\varvec{\gamma }}_{2}^{(k)}\big \Vert ^{2}/2 \end{aligned}$$

(14)

and

$$\begin{aligned}&\ell _{c}\big ({\varvec{\beta }}^{(k+1)}\big |{\varvec{Y}}, {\varvec{B}}_{1}^{(k+1)},{\varvec{B}}_{2}^{(k+1)}\big )-\ell _{c}\big ({\varvec{\beta }}^{(k)}\big |{\varvec{Y}}, {\varvec{B}}_{1}^{(k+1)},{\varvec{B}}_{2}^{(k+1)}\big )\\&\quad \ge \big \Vert {\varvec{\beta }}^{(k+1)}-{\varvec{\beta }}^{(k)}\big \Vert ^{2}/2 \end{aligned}$$

(15)

respectively. It follows from Eqs. (13), (14) and (15) that $||\varvec{\eta }^{(k+1)}-\varvec{\eta }^{(k)}||\rightarrow 0$ as $k \rightarrow \infty $. The conclusion then follows from Theorem 2 and Theorem 5 in Wu (1983).
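The conditional expectations appearing in $Q_{1}$ above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes the Poisson mean is $\lambda _{i}=\exp ({\varvec{z}}_{i}^{\mathrm {T}}{\varvec{\beta }})$ and that $p_{1i}$ and $p_{2i}$ are already evaluated at the current iterate. The function names `poisson_component_weight` and `q1` are hypothetical.

```python
import numpy as np

def poisson_component_weight(y, p1, p2, lam):
    """E[1 - B_1i | Y_i, eta^(k)]: weight of the Poisson component.

    p1  : probability that observation i is inflated (B_1i = 1)
    p2  : probability that an inflated observation equals 0 (B_2i = 1)
    lam : Poisson mean exp(z_i^T beta^(k))  (assumed parameterization)
    """
    y = np.asarray(y)
    p1, p2, lam = (np.broadcast_to(np.asarray(a, dtype=float), y.shape)
                   for a in (p1, p2, lam))
    pois0 = np.exp(-lam)        # P(V_i = 0)
    pois1 = lam * np.exp(-lam)  # P(V_i = 1)
    w0 = (1 - p1) * pois0 / (p1 * p2 + (1 - p1) * pois0)
    w1 = (1 - p1) * pois1 / (p1 * (1 - p2) + (1 - p1) * pois1)
    # Counts of 2 or more can only come from the Poisson component.
    return np.where(y == 0, w0, np.where(y == 1, w1, 1.0))

def q1(beta, y, Z, w):
    """Q_1 as a function of beta, given the E-step weights w."""
    lin = Z @ beta
    return np.sum(w * (y * lin - np.exp(lin)))

# Toy data, made up purely for illustration.
y = np.array([0, 1, 3])
Z = np.column_stack([np.ones(3), [0.5, -1.0, 2.0]])
beta_k = np.array([0.2, 0.1])
w = poisson_component_weight(y, p1=0.3, p2=0.6, lam=np.exp(Z @ beta_k))
value = q1(beta_k, y, Z, w)
```

Because the weights depend only on $\varvec{\eta }^{(k)}$, `q1` is a weighted Poisson log-likelihood in ${\varvec{\beta }}$, which is what makes the M-step (or a single GEM ascent step) tractable.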

1.2 Appendix A.2

The full conditional distributions of ${\varvec{\beta }}, {\varvec{\gamma }}_1$ and ${\varvec{\gamma }}_2$ with normal priors

After the data augmentation step, the joint posterior of ${\varvec{\beta }}$, ${\varvec{\gamma }}_{1}$ and ${\varvec{\gamma }}_{2}$ is

$$\begin{aligned} \begin{aligned}&{\pi [{\varvec{\beta }},{\varvec{\gamma }}_{1}, {\varvec{\gamma }}_{2}|\text {rest},{\varvec{Y}}]} \propto \exp \left\{ \left( \sum _{i=1}^{n}(1-B_{1i})V_{i} {\varvec{z}}_{i}\right) ^{\mathrm {T}}{\varvec{\beta }}\right\} \\&\quad \times \exp \left\{ \left( \sum _{i=1}^{n}B_{1i}\varvec{\omega }_{1i}\right) ^{\mathrm {T}}{\varvec{\gamma }}_{1}\right\} \exp \left\{ \left( \sum _{i=1}^{n}B_{2i}\varvec{\omega }_{2i}\right) ^{\mathrm {T}}{\varvec{\gamma }}_{2}\right\} \\&\quad \times \prod _{i=1}^{n}\frac{\exp \{-(1-B_{1i})\exp {({\varvec{z}}_{i}^ {\mathrm {T}}{\varvec{\beta }})}\}}{ (1+\exp {(\varvec{\omega }_{1i}^{\mathrm {T}} {\varvec{\gamma }}_{1})})(1+\exp {(\varvec{\omega }_{2i} ^{\mathrm {T}}{\varvec{\gamma }}_{2})})} \times \Pi ({\varvec{\beta }},{\varvec{\gamma }}_{1}, {\varvec{\gamma }}_{2}), \end{aligned} \end{aligned}$$

where $\Pi ({\varvec{\beta }},{\varvec{\gamma }}_{1},{\varvec{\gamma }}_{2})$ is the joint prior of the parameters ${\varvec{\beta }}, {\varvec{\gamma }}_{1}$ and ${\varvec{\gamma }}_{2}$. Let $N({\varvec{\beta }}_{0},\sigma _{\beta }^{2}{\varvec{I}}_{q})$, $N({\varvec{\gamma }}_{01},\sigma _{\gamma _{1}}^{2}{\varvec{I}}_{r_{1}})$ and $N({\varvec{\gamma }}_{02},\sigma _{\gamma _{2}}^{2}{\varvec{I}}_{r_{2}})$ be the priors for ${\varvec{\beta }}, {\varvec{\gamma }}_{1}$ and ${\varvec{\gamma }}_{2}$ respectively, and assume that they are mutually independent. The full conditional distributions of ${\varvec{\beta }}$, ${\varvec{\gamma }}_{1}$ and ${\varvec{\gamma }}_{2}$ are not standard distributions. Their densities are as follows:

$$\begin{aligned}&{\pi [{\varvec{\beta }}|\text {rest},{\varvec{Y}}]}\propto (\sigma _{\beta })^{-q}\exp \left\{ -\left( \frac{1}{2}\sigma _{\beta }^{-2}\right) ({\varvec{\beta }}-{\varvec{\beta }}_{0})^{\mathrm {T}} ({\varvec{\beta }}-{\varvec{\beta }}_{0})\right\} \nonumber \\&\quad \times \exp \left\{ \left( \sum _{i=1}^{n}(1-B_{1i})V_{i}{\varvec{z}}_{i} \right) ^{\mathrm {T}} {\varvec{\beta }}\right\} \prod _{i=1}^{n}\exp {\left\{ -(1-B_{1i})\exp {\big ({\varvec{z}}_{i}^ {\mathrm {T}}{\varvec{\beta }}\big )}\right\} }, \end{aligned}$$

(16)

$$\begin{aligned}&{\pi [{\varvec{\gamma }}_{1}|\text {rest},{\varvec{Y}}]}\propto (\sigma _{\gamma _{1}})^{-r_{1}}\exp \left\{ -\left( \frac{1}{2} \sigma _{\gamma _{1}} ^{-2}\right) ({\varvec{\gamma }}_{1}-{\varvec{\gamma }}_{01})^{\mathrm {T}} ({\varvec{\gamma }}_{1}-{\varvec{\gamma }}_{01})\right\} \nonumber \\&\quad \times \exp \left\{ \left( \sum _{i=1}^{n}B_{1i}\varvec{\omega }_{1i} \right) ^{\mathrm {T}} {\varvec{\gamma }}_{1}\right\} \prod _{i=1}^{n}\bigg (1+\exp {\bigg (\varvec{\omega }_{1i}^{\mathrm {T}} {\varvec{\gamma }}_{1}\bigg )}\bigg )^{-1} \end{aligned}$$

(17)

and

$$\begin{aligned} \begin{aligned}&{\pi [{\varvec{\gamma }}_{2}|\text {rest},{\varvec{Y}}]}\propto (\sigma _{\gamma _{2}})^{-r_{2}}\exp \left\{ -\left( \frac{1}{2}\sigma _{\gamma _{2}}^ {-2}\right) ({\varvec{\gamma }}_{2}-{\varvec{\gamma }}_{02})^{\mathrm {T}} ({\varvec{\gamma }}_{2}-{\varvec{\gamma }}_{02})\right\} \\&\quad \times \exp \left\{ \left( \sum _{i=1}^{n}B_{2i}\varvec{\omega }_{2i}\right) ^{\mathrm {T}} {\varvec{\gamma }}_{2}\right\} \prod _{i=1}^{n}\bigg (1+\exp {\bigg (\varvec{\omega }_{2i}^{\mathrm {T}} {\varvec{\gamma }}_{2}\bigg )}\bigg )^{-1}. \end{aligned} \end{aligned}$$

(18)
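For concreteness, the logarithms of the unnormalized densities (16) and (17) can be evaluated as in the sketch below; (18) has the same form as (17) with $B_{2i}$, $\varvec{\omega }_{2i}$ in place of $B_{1i}$, $\varvec{\omega }_{1i}$. This is an illustrative sketch only: the function names are hypothetical, $\sigma $ is treated as a prior standard deviation, and factors that do not depend on the parameter being sampled are dropped.

```python
import numpy as np

def log_cond_gamma(gamma, B, W, gamma0, sigma):
    """Log of (17) or (18) up to an additive constant.

    B : 0/1 latent indicators (B_1i for gamma_1, B_2i for gamma_2)
    W : design matrix whose rows are omega_i^T
    """
    diff = gamma - gamma0
    lin = W @ gamma
    log_prior = -0.5 * (diff @ diff) / sigma**2
    # logaddexp(0, x) = ln(1 + e^x), computed without overflow.
    log_lik = B @ lin - np.sum(np.logaddexp(0.0, lin))
    return log_prior + log_lik

def log_cond_beta(beta, B1, V, Z, beta0, sigma):
    """Log of (16) up to an additive constant."""
    diff = beta - beta0
    lin = Z @ beta
    log_prior = -0.5 * (diff @ diff) / sigma**2
    log_lik = np.sum((1 - B1) * (V * lin - np.exp(lin)))
    return log_prior + log_lik

# Toy inputs, made up purely for illustration.
W = np.array([[1.0, 0.5], [1.0, -0.5], [1.0, 2.0]])
B1 = np.array([1, 0, 1])
V = np.array([0, 2, 1])
lp_gamma = log_cond_gamma(np.array([0.1, -0.2]), B1, W, np.zeros(2), sigma=10.0)
lp_beta = log_cond_beta(np.array([0.3, 0.0]), B1, V, W, np.zeros(2), sigma=10.0)
```

Working on the log scale avoids underflow in the products over $i$, and it is the form a log-concave sampler would consume.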

Differentiating the logarithm of ${\pi [{\varvec{\beta }}|\text {rest},{\varvec{Y}}]}$ in (16) twice with respect to ${\varvec{\beta }}$ gives a negative Hessian that is positive definite:

$$\begin{aligned} -\frac{\partial ^{2} \ln {\pi [{\varvec{\beta }}|\text {rest},{\varvec{Y}}]}}{\partial {\varvec{\beta }}\partial {\varvec{\beta }}^{\mathrm {T}}} =\sigma _{\beta }^{-2}{\varvec{I}}_{q}+\sum _{i=1}^{n} (1-B_{1i})\exp ({\varvec{z}}_{i}^{\mathrm {T}}{\varvec{\beta }}) {\varvec{z}}_{i}{\varvec{z}}_{i}^{\mathrm {T}}>0. \end{aligned}$$

So ${\pi [{\varvec{\beta }}|\text {rest},{\varvec{Y}}]}$ is log-concave. The log-concavity of the conditional densities ${\pi [{\varvec{\gamma }}_{1}|\text {rest},{\varvec{Y}}]}$ and ${\pi [{\varvec{\gamma }}_{2}|\text {rest},{\varvec{Y}}]}$ can be proved similarly. Hence adaptive rejection sampling (ARS) can be used to sample ${\varvec{\beta }}$, ${\varvec{\gamma }}_{1}$ and ${\varvec{\gamma }}_{2}$ from their respective full conditional distributions.
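The log-concavity claim for ${\varvec{\beta }}$ can be checked numerically. The sketch below builds the negative Hessian of the logarithm of density (16), obtained by differentiating twice with respect to ${\varvec{\beta }}$, and inspects its smallest eigenvalue on simulated inputs. The function name and the simulated data are assumptions for illustration, and $\sigma $ is again treated as a prior standard deviation.

```python
import numpy as np

def neg_hessian_log_cond_beta(beta, B1, Z, sigma):
    """Negative Hessian of the log of (16) with respect to beta.

    Equals sigma^{-2} I_q + sum_i (1 - B_1i) exp(z_i^T beta) z_i z_i^T.
    The term linear in beta in (16) does not contribute.
    """
    q = Z.shape[1]
    w = (1 - B1) * np.exp(Z @ beta)  # nonnegative weights
    return np.eye(q) / sigma**2 + (Z * w[:, None]).T @ Z

# Simulated inputs, made up purely for illustration.
rng = np.random.default_rng(0)
n, q = 50, 3
Z = rng.normal(size=(n, q))
B1 = rng.integers(0, 2, size=n)
beta = rng.normal(size=q)

H = neg_hessian_log_cond_beta(beta, B1, Z, sigma=10.0)
min_eig = np.linalg.eigvalsh(H).min()
print("smallest eigenvalue of the negative Hessian:", min_eig)
```

The prior contributes $\sigma _{\beta }^{-2}{\varvec{I}}_{q}$ and the sum is positive semidefinite, so the smallest eigenvalue stays at or above $\sigma _{\beta }^{-2}$ whatever the latent indicators are. Since a concave function restricted to a line remains concave, each univariate conditional of a coordinate of ${\varvec{\beta }}$ is also log-concave, which is the setting in which ARS is usually applied.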

About this article

Cite this article

Liu, W., Tang, Y. & Xu, A. Zero-and-one-inflated Poisson regression model. Stat Papers 62, 915–934 (2021). https://doi.org/10.1007/s00362-019-01118-7

Received: 10 December 2018
Revised: 18 April 2019
Published: 28 June 2019
Issue Date: April 2021
DOI: https://doi.org/10.1007/s00362-019-01118-7
