1 Introduction

Time series of counts are emerging in almost every domain of application, be it economics, medicine, or the life sciences. Some examples include the monthly cases of crimes and offenses as studied in Bakouch and Ristić (2010), Ristić et al. (2009, 2012), Bourguignon and Vasconcellos (2015), Mamode Khan et al. (2020a), the daily numbers of new infections and deaths due to SARS-CoV-2 (Mamode Khan et al. 2020b), the weekly number of syphilis cases (Bourguignon et al. 2018), the number of daily fatal road traffic accidents (Pedeli and Karlis 2011), and the tick-by-tick intra-day transactions of stocks (Pedeli and Karlis 2013; Sunecher et al. 2018), among others. In such applications, the counting series are usually characterized by frequent low counts, mainly zeros and ones, particularly when data are collected at a very micro level. Likewise, the daily SARS-CoV-2 death and new-infection series in small island developing states such as Barbados, Guinea-Bissau, and Sao Tome consist mainly of 0’s and 1’s. The same remark applies to the sex offenses, arsenic, and domestic violence data available at http://www.forecastingprinciples.com. The interested reader may consult more examples in Li et al. (2015) and the references therein. Such an excess of zeros or ones leads to overdispersion in the series. This paper therefore proposes a first-order autoregressive integer-valued time series model with a zero-and-one-inflated innovation structure for such data series.

McKenzie (1985) and Al-Osh and Alzaid (1987) independently introduced the integer-valued autoregressive (INAR) process \(\{X_{t}\}_{t \in \mathbb {N}_{0}}\) with one lag using a binomial thinning operator as follows

$$\begin{aligned} X_{t}=\alpha \circ X_{t-1}+\epsilon _{t}, \quad t\in \mathbb {N} \end{aligned}$$
(1)

where \(0\le \alpha <1\), \(\{\epsilon _{t}\}_{t \in \mathbb {N}}\) is a sequence of independent and identically distributed integer-valued random variables, called innovations, with \(\epsilon _t\) independent of \(X_{t-k}\) for all \(k\ge 1\), \(E(\epsilon _{t})=\mu _{\epsilon }\) and \(\hbox{Var}(\epsilon _{t})=\sigma ^{2}_{\epsilon }\). The binomial thinning operator “\(\circ \)” is defined by Steutel and van Harn (1979) as \(\alpha \circ X_{t-1}=\sum _{j=1}^{X_{t-1}}Y_{j}\), where the counting series \(\{Y_{j}\}_{j \ge 1}\) is a sequence of independent and identically distributed Bernoulli random variables with \(P(Y_{j}=1)=1-P(Y_{j}=0)=\alpha \). From the results of Al-Osh and Alzaid (1987), \(\alpha \in [0,1)\) and \(\alpha =1\) correspond to stationarity and non-stationarity of the process \(\{X_{t}\}_{t \in \mathbb {N}_{0}}\), respectively. Also, \(\alpha =0\) (\(\alpha >0\)) implies independence (dependence) of the observations of \(\{X_{t}\}_{t \in \mathbb {N}_{0}}\). The following representation of the marginal distribution of the INAR(1) model, provided by Al-Osh and Alzaid (1987), is expressed in terms of the innovation sequence \(\epsilon _{t}\)

$$\begin{aligned} X_{t}=\sum _{i=0}^{\infty }\alpha ^{i}\circ \epsilon _{t-i}. \end{aligned}$$
(2)
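For illustration, the recursion (1) and the thinning operator can be coded directly; the following minimal R sketch (function names are ours, and Poisson innovations are used purely as an example) generates a sample path of a generic INAR(1) process.

```r
# Binomial thinning: alpha o x is the sum of x i.i.d. Bernoulli(alpha) variables
thin <- function(alpha, x) rbinom(1, size = x, prob = alpha)

# Simulate n steps of a generic INAR(1) path, given a sampler for the innovations
simulate_inar1 <- function(n, alpha, rinnov, x0 = 0) {
  x <- numeric(n)
  x_prev <- x0
  for (t in 1:n) {
    x[t] <- thin(alpha, x_prev) + rinnov(1)   # Eq. (1)
    x_prev <- x[t]
  }
  x
}

# Example with Poisson(1) innovations (the PINAR(1) case discussed below)
set.seed(1)
path <- simulate_inar1(200, alpha = 0.5, rinnov = function(m) rpois(m, 1))
```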

Modeling of INAR(1) time series based on (1) was first carried out with a Poisson marginal distribution by Al-Osh and Alzaid (1987) and McKenzie (1988), the resulting model being denoted PINAR(1). It is a simple model and is appropriate for modeling equidispersed time series data. In many practical scenarios, as discussed above, data are overdispersed. To cater for this phenomenon in counting series, Alzaid and Al-Osh (1988) considered INAR(1) processes with geometric marginal distribution for time series of overdispersed counts. Other useful overdispersed INAR(1) models have been proposed, such as the negative binomial INAR(1) process (McKenzie 1986) and the generalized Poisson INAR(1) process (Alzaid and Al-Osh 1993). Jazi et al. (2012a) and Schweer and Weiß (2014) introduced overdispersed INAR(1) models with geometric and compound Poisson innovations, respectively. There is a vast and growing literature on handling overdispersion in the simple INAR process (Weiß 2008; Awale et al. 2021; Huang and Zhu 2021; Weiß 2020). We note, however, that in addition to the self-decomposability properties, the construction of the INAR process becomes simpler when the distribution of the innovation series is specified directly, without compromising the marginal distribution of the counting series (see Bourguignon et al. 2019; Livio et al. 2018). In fact, Livio et al. (2018) confirm that such an INAR process with pre-specified innovations yields lower AICs than the competing INAR(1) models in Mohammadpour et al. (2018).

On the other hand, when a data set contains a large number of zeros, Jazi et al. (2012b) introduced an INAR(1) process with zero-inflated Poisson innovations and showed that the marginal distribution of the process is also zero-inflated. However, in the construction of an INAR(1) process, it is not always straightforward to derive a distribution of the counting series of the same form as the distribution of the zero-inflated innovation series. In this sense, Barreto-Souza (2015), Bakouch and Ristić (2010) and Bourguignon et al. (2018) studied novel INAR(1) models with zero-modified geometric and zero-truncated Poisson marginal distributions, following a construction similar to that in Livio et al. (2018). Furthermore, Li et al. (2015) developed the mixed INAR(1) process with zero-inflated generalized power series innovations, while Bakouch et al. (2018) investigated the zero-inflated geometric INAR(1) process with random coefficients. More recently, Sharafi et al. (2020) proposed the INAR(1) model with zero-modified Poisson–Lindley innovations.

However, when a data set is subject to zero-inflation along with one-inflation, the previous models are not very useful. In this research, we restrict our attention to modeling such data. Qi et al. (2019) introduced a stationary INAR(1) process with zero-and-one-inflated Poisson innovations. Also, Mohammadi et al. (2021) introduced the ZOIPLINAR(1) model, a stationary INAR(1) model with zero-and-one-inflated Poisson–Lindley distributed innovations. The geometric distribution is one of the most important distributions used to analyze count data. Many authors, such as McKenzie (1986), Ristić et al. (2009, 2012) and Jazi et al. (2012a, b), used the geometric distribution to analyze count time series data. This fact motivated us to introduce a flexible INAR(1) model with zero-and-one-inflated geometric innovations for count data, especially for analyzing real COVID-19 data.

It should be mentioned that, based on the PACF plot, in many COVID-19 time series applications order 1 seems unsuitable and higher-order models are needed. Recently, Foroughi et al. (2021) introduced a new portmanteau test to examine the null hypothesis \(H_0: X_t \sim \hbox{GINAR}(1)\) versus the alternative \(H_1: X_t \sim \hbox{GINAR}(p)\) for \(p>1\) and a wide class of INAR processes, called generalized INAR processes. They developed several portmanteau test statistics to check the adequacy of the fitted model. In this paper, we use these test statistics to check the adequacy of our proposed model in the practical data examples.

The paper is organized as follows. In Sect. 2, we introduce and construct a flexible INAR(1) model and obtain some of its statistical and conditional properties. Section 3 is devoted to parameter estimation of the model using two methods, conditional maximum likelihood and conditional least squares. In Sect. 4, we present some simulation experiments and real-life data applications to assess the performance of the proposed zero-and-one-inflated INAR model.

2 Model Construction and Properties

In this section, we introduce a flexible INAR(1) process with zero-and-one-inflated geometric-distributed innovations denoted by INARZOIG(1) and present some of its properties.

Based on Eq. (1), we define the INARZOIG(1) process as follows:

$$\begin{aligned} X_{t}=\alpha \circ X_{t-1}+\epsilon _{t},\quad t\in \mathbb {N} \end{aligned}$$
(3)

where \(0<\alpha <1\) and the innovation process \(\{\epsilon _{t}\}\) is said to have a zero-and-one-inflated geometric distribution, denoted by ZOIG\((\phi _{0},\phi _{1},\theta )\), with the following probability mass function (pmf),

$$\begin{aligned} P(\epsilon _{t}=k)=\left\{ \begin{array}{ll} \phi _{0}+(1-\phi _{0}-\phi _{1})\frac{1}{1+\theta }, &{} \mathrm{if}\, k=0 \\ \phi _{1}+(1-\phi _{0}-\phi _{1})\frac{\theta }{(1+\theta )^2}, &{} \mathrm{if}\, k=1 \\ (1-\phi _{0}-\phi _{1})\frac{1}{1+\theta }(\frac{\theta }{1+\theta })^k,&{} \mathrm{if}\, k=2,3,\ldots , \end{array} \right. \end{aligned}$$
(4)

where \(0\le \phi _{0},\phi _{1}<1\) and \(\theta >0\). The parameter \(\theta \) is the mean of the traditional geometric distribution, and the parameters \(\phi _{0}\) and \(\phi _{1}\) denote the unknown proportions of extra zeros and ones beyond those allowed by the traditional geometric distribution, respectively. Also, \(\epsilon _t\) is independent of \(X_s\) for all \(t>s\) and of the counting series contained in the binomial thinning operator “\(\circ \).”
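The mixture structure of (4) is perhaps easiest to see in code; a direct R transcription of the pmf (the function name dzoig is our own choice) is the following, where the geometric component has success probability \(1/(1+\theta )\) and hence mean \(\theta \).

```r
# pmf of the ZOIG(phi0, phi1, theta) innovation distribution in Eq. (4)
dzoig <- function(k, phi0, phi1, theta) {
  phi2 <- 1 - phi0 - phi1
  p <- phi2 * (1 / (1 + theta)) * (theta / (1 + theta))^k  # geometric part
  p[k == 0] <- p[k == 0] + phi0                            # extra zeros
  p[k == 1] <- p[k == 1] + phi1                            # extra ones
  p
}

# Sanity check: probabilities over a long support should sum to (almost) 1
sum(dzoig(0:1000, phi0 = 0.4, phi1 = 0.2, theta = 1))
```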

Based on Du and Li (1991) and Dion et al. (1995), it can easily be shown that this process is stationary if and only if \(0<\alpha <1\). The process reduces to the INARZIG(1) model when \(\phi _{1}=0\) and to the INAROIG(1) model when \(\phi _{0}=0\).

In the following proposition, some moments and conditional moments of the INARZOIG(1) process are summarized for later use.

Proposition 1

Let \(\{X_{t}\}\) be the process defined by (3). Then

(a) \(E(X_{t} \vert X_{t-1})=\alpha X_{t-1}+\mu _{\epsilon }\),

(b) \(\mu =E(X_{t})=\frac{\mu _{\epsilon }}{1-\alpha }\),

(c) \(\hbox{Var}(X_{t}\vert X_{t-1})=\alpha (1-\alpha )X_{t-1}+\sigma ^{2}_{\epsilon }\),

(d) \(\sigma ^{2}=\hbox{Var}(X_{t})=\frac{\alpha \mu _{\epsilon }+\sigma ^{2}_{\epsilon }}{1-\alpha ^{2}}\),

(e) \(\gamma (k)=\hbox{Cov}(X_{t},X_{t-k})=\alpha ^{k}\gamma (0)\), \(\rho (X_{t},X_{t-k})=\hbox{Corr}(X_{t},X_{t-k})=\alpha ^{k}\),

(f) \(\psi _{X_{t}}(s)=\prod _{i=0}^{\infty }\left[ \phi _{0}+\phi _{1}(1-\alpha ^{i}+\alpha ^{i}s)+\frac{\phi _{2}}{1+\theta \alpha ^{i}-\theta \alpha ^{i} s}\right] \),

where \(\phi _{2}=1-\phi _{0}-\phi _{1}\), \(\mu _{\epsilon }= E(\epsilon _{t})=\phi _{1}+\phi _{2}\theta \) and \(\sigma ^{2}_{\epsilon }= \hbox{Var}(\epsilon _{t})=\mu _{\epsilon }-\mu _{\epsilon }^{2}+2\phi _{2} \theta ^{2}\).

The proof of Proposition 1 is similar to that of Theorem 1 in Qi et al. (2019); we omit the details and refer the reader to that paper.
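For later reference, the unconditional moments in parts (b) and (d), together with the innovation moments \(\mu _{\epsilon }\) and \(\sigma ^{2}_{\epsilon }\) quoted above, can be evaluated with a small R helper (a sketch with our own naming).

```r
# Moments of the ZOIG innovations and of the INARZOIG(1) process
# (parts (b) and (d) of Proposition 1)
inarzoig_moments <- function(alpha, theta, phi0, phi1) {
  phi2    <- 1 - phi0 - phi1
  mu_eps  <- phi1 + phi2 * theta
  var_eps <- mu_eps - mu_eps^2 + 2 * phi2 * theta^2
  mu_x    <- mu_eps / (1 - alpha)
  var_x   <- (alpha * mu_eps + var_eps) / (1 - alpha^2)
  list(mu_eps = mu_eps, var_eps = var_eps, mu_x = mu_x, var_x = var_x)
}

inarzoig_moments(alpha = 0.5, theta = 1, phi0 = 0.4, phi1 = 0.2)
```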

The conditional mean is one of the most common tools for forecasting time series processes. In the next proposition, the conditional mean and variance of the INARZOIG(1) process are obtained.

Proposition 2

For the INARZOIG(1) process, the (h + 1)-step-ahead forecast, given by the conditional mean, and the conditional variance are

$$\begin{aligned} \hat{X}_{t+h}=E(X_{t+h}\vert X_{t-1})=\alpha ^{h+1}X_{t-1}+\frac{(1-\alpha ^{h+1})\mu _{\epsilon }}{(1-\alpha )}, \end{aligned}$$
(5)

and

$$\begin{aligned} \hbox{Var}(X_{t+h}\vert X_{t-1}) &= \alpha ^{h+1}(1-\alpha ^{h+1})X_{t-1}+\frac{\sigma ^{2}_{\epsilon } (1-\alpha ^{2(h+1)})}{1-\alpha ^{2}} \\ & {\quad}+ \frac{\mu _{\epsilon }(1-\alpha ^{h})(\alpha -\alpha ^{h+2})}{1-\alpha ^{2}}, \end{aligned}$$
(6)

respectively.

Proof

The proof of Proposition 2 is given in the Appendix.□

It is clear that \(E(X_{t+h}\vert X_{t-1})\rightarrow \frac{\mu _{\epsilon }}{(1-\alpha )}\) as \(h\rightarrow \infty \), which is the unconditional mean of the process. Also, the (h + 1)-step ahead conditional variance converges to \(\frac{\alpha \mu _{\epsilon }+\sigma ^{2}_{\epsilon }}{1-\alpha ^{2}}=\sigma ^{2}\) as \(h\rightarrow \infty \).
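Equations (5) and (6) translate directly into a forecasting helper; the R sketch below (with names of our own) returns the conditional mean and variance of \(X_{t+h}\) given \(X_{t-1}=x\).

```r
# (h+1)-step-ahead conditional mean and variance, Eqs. (5) and (6),
# given the last observed value x = X_{t-1}
inarzoig_forecast <- function(x, h, alpha, mu_eps, var_eps) {
  a <- alpha^(h + 1)
  mean_f <- a * x + (1 - a) * mu_eps / (1 - alpha)
  var_f  <- a * (1 - a) * x +
            var_eps * (1 - alpha^(2 * (h + 1))) / (1 - alpha^2) +
            mu_eps * (1 - alpha^h) * (alpha - alpha^(h + 2)) / (1 - alpha^2)
  c(mean = mean_f, var = var_f)
}

# Example: one step ahead from X_{t-1} = 3 (h = 0 in the notation of Eq. (5))
inarzoig_forecast(x = 3, h = 0, alpha = 0.5, mu_eps = 0.6, var_eps = 1.0)
```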

According to Proposition 1, the Fisher index of dispersion for the model can be calculated as

$$\begin{aligned} \hbox{FI}_{X}=\frac{\sigma ^{2}}{\mu }=\frac{\alpha +\hbox{FI}_{\epsilon }}{1+\alpha }, \end{aligned}$$
(7)

where \(\hbox{FI}_{\epsilon }=1-\mu _{\epsilon }+2 \frac{\theta ^{2}\phi _{2}}{\mu _{\epsilon }}\). Thus, the dispersion of the INARZOIG(1) process follows that of its innovation process, i.e., it is overdispersed (underdispersed) if the innovation process \(\{\epsilon _{t}\}\) is overdispersed (underdispersed).

Remark 1

After some calculation, it is easy to show that \(\epsilon _{t}\) is overdispersed if \(\theta > \frac{\phi _{1}}{\sqrt{2\phi _{2}}-\phi _{2}}\), it is underdispersed if \(0<\theta <\frac{\phi _{1}}{\sqrt{2\phi _{2}}-\phi _{2}}\) and it is equidispersed if \(\theta =\frac{\phi _{1}}{\sqrt{2\phi _{2}}-\phi _{2}}\).
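A quick numerical check of Remark 1 can be coded as follows (illustrative R with our own naming).

```r
# Classify the dispersion of the ZOIG innovations according to Remark 1
innovation_dispersion <- function(theta, phi0, phi1) {
  phi2      <- 1 - phi0 - phi1
  threshold <- phi1 / (sqrt(2 * phi2) - phi2)
  if (theta > threshold) {
    "overdispersed"
  } else if (theta < threshold) {
    "underdispersed"
  } else {
    "equidispersed"
  }
}

# For phi0 = 0.4, phi1 = 0.2 the threshold is about 0.40, so theta = 1 gives overdispersion
innovation_dispersion(theta = 1, phi0 = 0.4, phi1 = 0.2)
```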

Since the model (3) forms a stationary discrete-time Markov chain, the transition probabilities are obtained as (see, e.g., Weiß 2008):

$$\begin{aligned} P_{ij}=P(X_{t}=j\vert X_{t-1}=i)=\sum _{k=0}^{{\rm min}(i,j)} {i \atopwithdelims ()k}\alpha ^{k}(1-\alpha )^{i-k}P(\epsilon _{t}=j-k) \end{aligned}$$
(8)

where \(P(\epsilon _{t}=j)\) is the pmf of \(\{\epsilon _{t}\}\) defined by (4) and \(i, j = 0, 1, \ldots \).
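The transition probability (8) is a finite convolution of a binomial term with the innovation pmf; a minimal R sketch (reusing a ZOIG pmf as in Sect. 2, with names of our own) is given below.

```r
# One-step transition probability P(X_t = j | X_{t-1} = i) from Eq. (8)
dzoig <- function(k, phi0, phi1, theta) {
  phi2 <- 1 - phi0 - phi1
  p <- phi2 * (1 / (1 + theta)) * (theta / (1 + theta))^k
  p[k == 0] <- p[k == 0] + phi0
  p[k == 1] <- p[k == 1] + phi1
  p
}

trans_prob <- function(i, j, alpha, phi0, phi1, theta) {
  k <- 0:min(i, j)
  sum(dbinom(k, size = i, prob = alpha) * dzoig(j - k, phi0, phi1, theta))
}

trans_prob(i = 3, j = 2, alpha = 0.5, phi0 = 0.4, phi1 = 0.2, theta = 1)
```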

Hence, the marginal probability function of \(X_{t}\) of INARZOIG(1) is obtained as:

$$\begin{aligned} P_{j}&=P(X_{t}=j)=\sum _{i=0}^{\infty }P(X_{t}=j\vert X_{t-1}=i)P(X_{t-1}=i) \\&=\sum _{i=0}^{\infty }P_{ij}P_{i}=\sum _{i=0}^{\infty }\sum _{k=0}^{{\rm min}(i,j)}{i \atopwithdelims ()k}\alpha ^{k}(1-\alpha )^{i-k}P(\epsilon _{t}=j-k)P_{i},\quad j=0, 1, \ldots . \end{aligned}$$
(9)

Also, the joint probability function of the process, using the first-order Markov dependence, can be calculated as:

$$\begin{aligned}& f(i_{1},i_{2},\ldots ,i_{n}) = P(X_{1}=i_{1},X_{2}=i_{2},\ldots , X_{n}=i_{n}) \\& = P_{i_{1}}\prod _{s=1}^{n-1}\left\{\sum _{k=0}^{{\rm min}(i_{s},i_{s+1})}{i_{s} \atopwithdelims ()k}\alpha ^{k}(1-\alpha )^{i_{s}-k}P(\epsilon _{t}=i_{s+1}-k)\right\}. \end{aligned}$$
(10)

2.1 Distributions of Lengths of Zeros and Ones

Mood (1940) defined a “run” as a succession of similar events preceded and succeeded by different events. In this section, we find the expected length of the runs of zeros and of the runs of ones for the INARZOIG(1) process.

Theorem 1

The expected length of the runs of zeros for the INARZOIG(1) process is

$$\begin{aligned} \eta _{0}=\frac{1+\theta }{(1+\theta )\phi _{1}+\theta \phi _{2}} \end{aligned}$$
(11)

and the expected length of the runs of ones for the INARZOIG(1) process is

$$\begin{aligned} \eta _{1}=\frac{(1+\theta )^2}{(1+\theta )^2[1-\alpha \phi _{0}- (1-\alpha )\phi _{1}]-\phi _{2}(\theta +\alpha )}. \end{aligned}$$
(12)

Proof

The zero-to-zero transition probability for the INARZOIG(1) process is obtained as:

$$\begin{aligned} P_{00}=P(X_{t}=0\vert X_{t-1}=0)=P(\epsilon _{t}=0)=\phi _{0}+\frac{\phi _{2}}{1+\theta }. \end{aligned}$$
(13)

Therefore, the transition probability from zero to a nonzero value for the INARZOIG(1) process is obtained as

$$p^{*}=P(X_{t}\ne 0\vert X_{t-1}=0)=1-\left\{\phi _{0}+\frac{\phi _{2}}{1+\theta }\right\}=\phi _{1} +\frac{\theta \phi _{2}}{1+\theta}.$$
(14)

Since the run length of zeros is defined as the number of zeros between two nonzero values, it can be shown that it follows a geometric distribution with parameter \(p^{*}\), and hence the expected run length of zeros in the process is \(\frac{1}{p^{*}}\). The expected run length of ones can be obtained similarly.□

The expected length of the runs of zeros is independent of \(\alpha \). If \(\phi _{0}=0\) or \(\phi _{1}=0\), we obtain the expected length of the runs for the INAROIG(1) or INARZIG(1) process, respectively.
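Both expected run lengths, (11) and (12), are simple closed forms; for convenience they can be evaluated as below (an R sketch with our own naming).

```r
# Expected run lengths of zeros (Eq. (11)) and of ones (Eq. (12))
run_lengths <- function(alpha, theta, phi0, phi1) {
  phi2 <- 1 - phi0 - phi1
  eta0 <- (1 + theta) / ((1 + theta) * phi1 + theta * phi2)
  eta1 <- (1 + theta)^2 /
          ((1 + theta)^2 * (1 - alpha * phi0 - (1 - alpha) * phi1) - phi2 * (theta + alpha))
  c(eta0 = eta0, eta1 = eta1)
}

run_lengths(alpha = 0.5, theta = 1, phi0 = 0.4, phi1 = 0.2)
```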

Theorem 2

The proportion of zeros in the INARZOIG(1) process is given by

$$\begin{aligned} P(X_{t}=0)=\prod _{i=0}^{\infty }\left[ \phi _{0}+\phi _{1}(1-\alpha ^{i})+ \frac{\phi _{2}}{1+\theta \alpha ^{i}} \right] , \end{aligned}$$
(15)

and the proportion of ones in the INARZOIG(1) process is

$$\begin{aligned} P(X_{t}=1)=\sum _{j=0}^{\infty }\left[ \left(\phi _{1}\alpha ^{j}+\frac{\phi _{2} \theta \alpha ^{j}}{(1+\theta \alpha ^{j})^{2}}\right)\prod _{i\ne j}\left[\phi _{0}+\phi _{1}(1-\alpha ^{i})+\frac{\phi _{2}}{1+\theta \alpha ^{i}}\right]\right] . \end{aligned}$$
(16)

Proof

Using part (f) of Proposition 1 and based on the following relationship between the probability generating function (pgf) and pmf,

$$\begin{aligned} P(X_{t}=k)=\frac{1}{k!}\psi _{X_{t}}^{(k)}(0) \end{aligned}$$

where \(\psi _{X_{t}}^{(k)}(.)\) denotes the kth derivative of the pgf \(\psi _{X_{t}}(.)\), the proof is completed by evaluating the following quantities.

$$\begin{aligned} P(X_{t}=0)=\psi ^{(0)}_{X_{t}}(0)=\psi _{X_{t}}(0), \end{aligned}$$

and

$$\begin{aligned} P(X_{t}=1)=\psi ^{(1)}_{X_{t}}(0). \end{aligned}$$
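In practice, the infinite product in (15) and the sum of leave-one-out products in (16) can be truncated at a moderate order, since \(\alpha ^{i}\rightarrow 0\) rapidly; the following R sketch (truncation level and names are our own choices) approximates both quantities.

```r
# Approximate P(X_t = 0) and P(X_t = 1) from Eqs. (15) and (16),
# truncating the infinite products/sums at i, j <= M
marginal_zero_one <- function(alpha, theta, phi0, phi1, M = 200) {
  phi2  <- 1 - phi0 - phi1
  i     <- 0:M
  terms <- phi0 + phi1 * (1 - alpha^i) + phi2 / (1 + theta * alpha^i)
  p0    <- prod(terms)
  deriv <- phi1 * alpha^i + phi2 * theta * alpha^i / (1 + theta * alpha^i)^2
  p1    <- sum(deriv * p0 / terms)   # prod over i != j equals p0 / terms[j]
  c(p0 = p0, p1 = p1)
}

marginal_zero_one(alpha = 0.5, theta = 1, phi0 = 0.4, phi1 = 0.2)
```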

3 Parameter Estimation

Let \(X=(X_1,\ldots ,X_{n})\) be observations from the model (3) and \({{\boldsymbol{\lambda }}} \in \Lambda =\{(\alpha ,\theta ,\phi _{0},\phi _{1})^{'};0<\alpha<1,0\le \phi _{0}<1, 0\le \phi _{1}<1, \theta >0\}\) denote the parameter vector. In the study of integer-valued time series, different estimation methods are applied. In this section, we are going to estimate the parameters of the INARZOIG(1) model using conditional maximum likelihood (CML) and conditional least squares (CLS) estimation methods.

3.1 Conditional Maximum Likelihood Estimation

For notational simplicity, we write the likelihood function through the joint probability function (10) as

$$\begin{aligned}P(X_{1}&=x_{1})\prod _{i=1}^{n-1}P_{{\boldsymbol{\lambda }}}(X_{i+1}=x_{i+1}\vert X_{i}=x_{i})\\ & = P_{X_{1}}\prod _{i=1}^{n-1}\left\{\sum _{k=0}^{{\rm min}(X_{i},X_{i+1})}{X_{i} \atopwithdelims ()k}\alpha ^{k}(1-\alpha )^{X_{i}-k}P(\epsilon _{t}=X_{i+1}-k)\right\}, \end{aligned}$$
(17)

where \(P_{X_{1}}\) is the pmf of \(X_{1}\) and \(P_{{\boldsymbol{\lambda }}}(X_{i+1}\vert X_{i})\) is the conditional pmf. To overcome the complexity of the marginal distribution, a simple approach is to condition on the first observation \(X_{1}\), essentially ignoring the contribution of the initial value, and obtain the conditional maximum likelihood (CML) estimate of \({\boldsymbol{\lambda }}\) given \(X_{1}\) by maximizing the conditional log-likelihood

$$\begin{aligned} l _{n}({\boldsymbol{\lambda }}\vert X_{1})=\sum _{i=1}^{n-1}\log P_{{\boldsymbol{\lambda }}}(X_{i+1}\vert X_{i}), \end{aligned}$$
(18)

over \({\boldsymbol{\lambda }}\). Since there is no closed form for the CML estimates, they are obtained using numerical methods. The asymptotic properties of the CML estimators follow from Freeland and McCabe (2004).
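As an illustration of how the CML estimate can be computed in practice, the following R sketch builds the conditional log-likelihood (18) from the transition probabilities (8) and maximizes it with nlminb under box constraints on \({\boldsymbol{\lambda }}\) (function names, starting values and the penalty for \(\phi _0+\phi _1\ge 1\) are our own choices, not the authors' code).

```r
# Negative conditional log-likelihood of the INARZOIG(1) model, Eq. (18)
dzoig <- function(k, phi0, phi1, theta) {
  phi2 <- 1 - phi0 - phi1
  p <- phi2 * (1 / (1 + theta)) * (theta / (1 + theta))^k
  p[k == 0] <- p[k == 0] + phi0
  p[k == 1] <- p[k == 1] + phi1
  p
}

neg_cloglik <- function(par, x) {
  alpha <- par[1]; theta <- par[2]; phi0 <- par[3]; phi1 <- par[4]
  if (phi0 + phi1 >= 1) return(1e10)          # keep phi2 > 0
  ll <- 0
  for (t in 2:length(x)) {
    k <- 0:min(x[t - 1], x[t])
    p <- sum(dbinom(k, x[t - 1], alpha) * dzoig(x[t] - k, phi0, phi1, theta))
    ll <- ll + log(max(p, 1e-300))            # guard against log(0)
  }
  -ll
}

# CML fit via nlminb (the optimizer also used later in the simulations)
cml_fit <- function(x, start = c(0.3, 1, 0.2, 0.2)) {
  nlminb(start, neg_cloglik, x = x,
         lower = c(1e-4, 1e-4, 0, 0), upper = c(0.999, Inf, 0.999, 0.999))
}
```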

3.2 Conditional Least Squares Estimation

In this subsection, we describe the estimation of the unknown parameters of the INARZOIG(1) process using the two-step CLS estimation method proposed by Karlsen and Tjøstheim (1988), which is carried out as follows.

Step 1

Let \({\boldsymbol{\beta }}_{1}=(\alpha ,\mu _{\epsilon })^{'}\); then the conditional least squares (CLS) estimators of the parameters \(\alpha \) and \(\mu _{\epsilon }\) are obtained by minimizing the function

$$\begin{aligned} Q_{n}({\boldsymbol{\beta }}_{1})=\sum _{t=2}^{n}[X_{t}-g_{1} ({\boldsymbol{\beta }}_{1},X_{t-1})]^{2}, \end{aligned}$$
(19)

where \(g_{1}({\boldsymbol{\beta }}_{1},X_{t-1})=E(X_{t}\vert X_{t-1})=\alpha X_{t-1}+\mu _{\epsilon }\), and are given by

$$\begin{aligned} \hat{\alpha }_{\mathrm{cls}}=\frac{(n-1)\sum _{t=2}^{n}X_{t}X_{t-1} -\sum _{t=2}^{n}X_{t}\sum _{t=2}^{n}X_{t-1}}{(n-1)\sum _{t=2}^{n} X_{t-1}^{2}-(\sum _{t=2}^{n}X_{t-1})^{2}}, \end{aligned}$$
(20)

and

$$\begin{aligned} \hat{\mu }_{\epsilon ,\mathrm{cls}}=\frac{\sum _{t=2}^{n}(X_{t}-\hat{\alpha }_{\mathrm{cls}}X_{t-1})}{n-1}. \end{aligned}$$
(21)
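The closed-form estimators (20) and (21) can be computed directly from the observed series; a short R sketch (our naming) follows.

```r
# Step 1 of the CLS procedure: closed-form estimates of alpha and mu_eps,
# Eqs. (20) and (21)
cls_step1 <- function(x) {
  n    <- length(x)
  xt   <- x[2:n]
  xlag <- x[1:(n - 1)]
  m    <- n - 1
  alpha_hat <- (m * sum(xt * xlag) - sum(xt) * sum(xlag)) /
               (m * sum(xlag^2) - sum(xlag)^2)
  mu_eps_hat <- sum(xt - alpha_hat * xlag) / m
  c(alpha = alpha_hat, mu_eps = mu_eps_hat)
}
```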

Step 2

Let \({\boldsymbol{\beta }}_{2}=(\phi _{0},\phi _{1})^{'}\) and \(Y_{t}=X_{t}-E(X_{t}\vert X_{t-1})\), \(g_{2}({\boldsymbol{\beta }}_{2},X_{t-1})=\hbox{Var}(X_{t}\vert X_{t-1})\). Then

$$\begin{aligned} E(Y_{t}^{2}\vert X_{t-1})= & {} E(\left[ X_{t}-E(X_{t}\vert X_{t-1})\right] ^{2}\vert X_{t-1}) \\= & {} \hbox{Var}(X_{t}\vert X_{t-1})=g_{2}({\boldsymbol{\beta }}_{2},X_{t-1}) \end{aligned}$$

where \(\hbox{Var}(X_{t}\vert X_{t-1})=\hat{\alpha }_{\mathrm{cls}}(1-\hat{\alpha }_{\mathrm{cls}})X_{t-1}+\hat{\mu }_{\epsilon ,\mathrm{cls}} -\hat{\mu }^{2}_{\epsilon ,\mathrm{cls}}+\frac{2(\hat{\mu }_{\epsilon ,\mathrm{cls}}-\phi _{1})^{2}}{(1-\phi _{0}-\phi _{1})}.\) Therefore, the CLS criterion function for \({\boldsymbol{\beta }}_{2}\) can be written as

$$\begin{aligned} Q^{*}_{n}({\boldsymbol{\beta }}_{2})= & {} \sum _{t=2}^{n} \left[ Y^{2}_{t}-g_{2}({\boldsymbol{\beta }}_{2},X_{t-1})\right] ^{2} \end{aligned}$$
(22)

The CLS estimator \(\hat{{\boldsymbol{\beta }}}_{2,\mathrm{cls}}=(\hat{\phi }_{0,\mathrm{cls}},\hat{\phi }_{1,\mathrm{cls}})^{'}\) of \({\boldsymbol{\beta }}_{2}\) is obtained by numerically minimizing (22).
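Since (22) has no closed-form minimizer, a numerical routine is needed; the sketch below plugs the Step 1 estimates into the criterion and minimizes it with nlminb (names, starting values and the penalty for \(\phi _0+\phi _1\ge 1\) are our own choices). The remaining parameter \(\theta \) is then recovered in Step 3 below.

```r
# Step 2 of the CLS procedure: minimize Q*_n in Eq. (22) over (phi0, phi1)
cls_step2 <- function(x, alpha_hat, mu_eps_hat) {
  n    <- length(x)
  xt   <- x[2:n]
  xlag <- x[1:(n - 1)]
  y2   <- (xt - alpha_hat * xlag - mu_eps_hat)^2     # Y_t^2
  Qstar <- function(par) {
    phi0 <- par[1]; phi1 <- par[2]
    if (phi0 + phi1 >= 1) return(1e10)
    g2 <- alpha_hat * (1 - alpha_hat) * xlag + mu_eps_hat - mu_eps_hat^2 +
          2 * (mu_eps_hat - phi1)^2 / (1 - phi0 - phi1)
    sum((y2 - g2)^2)
  }
  nlminb(c(0.2, 0.2), Qstar, lower = c(0, 0), upper = c(0.999, 0.999))$par
}
```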

Step 3

Based on the results from Steps 1 and 2, the estimator \(\hat{\theta }_{\mathrm{cls}}\) of \(\theta \) can be obtained by considering the following equation:

$$\begin{aligned} \hat{\mu }_{\epsilon ,\mathrm{cls}}=\hat{\phi }_{1,\mathrm{cls}}+(1-\hat{\phi }_{0,\mathrm{cls}} -\hat{\phi }_{1,\mathrm{cls}})\theta , \end{aligned}$$

Therefore, the resulting CLS estimator is \((\hat{\alpha }_{\mathrm{cls}},\hat{\theta }_{\mathrm{cls}},\hat{\phi }_{0,\mathrm{cls}},\hat{\phi }_{1,\mathrm{cls}})^{'}\). To study the asymptotic behavior of the estimators, we make the following assumptions:

(C1) \({X_{t}}\) is a stationary and ergodic process.

(C2) \(E(X_{t}^{4})<\infty. \)

Proposition 3

Under the assumptions (C1) and (C2), the CLS estimator \(\hat{{\boldsymbol{\beta }}}_{1,\mathrm{cls}}=(\hat{\alpha }_{\mathrm{cls}},\hat{\mu }_{\epsilon ,\mathrm{cls}})^{'}\) is strongly consistent and asymptotically normal,

$$\begin{aligned} \sqrt{n}\left( \hat{{\boldsymbol{\beta }}}_{1,\mathrm{cls}}-{\boldsymbol{\beta }}_{1,0} \right) \overset{L}{\longrightarrow } N(\mathbf {0},V_{1}^{-1}W_{1}V_{1}^{-1}) \end{aligned}$$

where \({\boldsymbol{\beta }}_{1,0}=(\alpha _{0},\mu _{\epsilon ,0})^{'}\) denote the true value of \({\boldsymbol{\beta }}_{1}\), \(V_{1}=E[ \frac{\partial }{\partial {\boldsymbol{\beta }}_{1}}g_{1}({\boldsymbol{\beta }}_{1,0},X_{1})\frac{\partial }{\partial {\boldsymbol{\beta }}^{'}_{1}}g_{1}({\boldsymbol{\beta }}_{1,0},X_{1})] \), \(q_{11}({\boldsymbol{\beta }}_{1,0})=\left[ X_{2}-g_{1}({\boldsymbol{\beta }}_{1,0}, X_{1})\right] ^{2}\), and \(W_{1}=E\left[ q_{11}({\boldsymbol{\beta }}_{1,0})\frac{\partial }{\partial {\boldsymbol{\beta }}_{1}}g_{1}({\boldsymbol{\beta }}_{1,0},X_{1})\frac{\partial }{\partial {\boldsymbol{\beta }}^{'}_{1}}g_{1}({\boldsymbol{\beta }}_{1,0},X_{1})\right] \).

Proposition 4

Under the assumptions (C1) and (C2), the CLS estimator \(\hat{{\boldsymbol{\beta }}}_{2,\mathrm{cls}}=(\hat{\phi }_{0,\mathrm{cls}},\hat{\phi }_{1,\mathrm{cls}})^{'}\) is strongly consistent and asymptotically normal,

$$\begin{aligned} \sqrt{n}\left( \hat{{\boldsymbol{\beta }}}_{2,\mathrm{cls}}-{\boldsymbol{\beta }}_{2,0} \right) \overset{L}{\longrightarrow } N(\mathbf {0},V_{2}^{-1}W_{2}V_{2}^{-1}) \end{aligned}$$

where \({\boldsymbol{\beta }}_{2,0}=(\phi _{0,0},\phi _{1,0})^{'}\) denotes the true value of \({\boldsymbol{\beta }}_{2}\), \(V_{2}=E[ \frac{\partial }{\partial {\boldsymbol{\beta }}_{2}}g_{2}({\boldsymbol{\beta }}_{2,0},X_{1})\frac{\partial }{\partial {\boldsymbol{\beta }}^{'}_{2}}g_{2}({\boldsymbol{\beta }}_{2,0},X_{1})] \), \(q_{21}({\boldsymbol{\beta }}_{2,0})=[ Y_{2}^{2}-g_{2} ({\boldsymbol{\beta }}_{2,0},X_{1})] ^{2}\), and \(W_{2}=E[ q_{21}({\boldsymbol{\beta }}_{2,0})\frac{\partial }{\partial {\boldsymbol{\beta }}_{2}}g_{2}({\boldsymbol{\beta }}_{2,0},X_{1})\frac{\partial }{\partial {\boldsymbol{\beta }}^{'}_{2}}g_{2}({\boldsymbol{\beta }}_{2,0},X_{1})] .\)

Based on Propositions 3 and 4 and Theorem 3.2 in Nicholls and Quinn (1982), we have the following proposition.

Proposition 5

Under the assumptions (C1) and (C2), the CLS estimator \(\hat{{\boldsymbol{\beta }}}_{\mathrm{cls}}=(\hat{{\boldsymbol{\beta }}}_{1,\mathrm{cls}}, \hat{{\boldsymbol{\beta }}}_{2,\mathrm{cls}})^{'}\) is strongly consistent and asymptotically normal,

$$\begin{aligned} \sqrt{n}\left( \hat{{\boldsymbol{\beta }}}_{\mathrm{cls}}-{\boldsymbol{\beta }}_{0} \right) \overset{L}{\longrightarrow } N(\mathbf {0},{\Omega} ), \end{aligned}$$

where

$$\begin{aligned} \Omega =\left[ \begin{array}{cc} V_{1}^{-1}W_{1}V_{1}^{-1}&{} V_{1}^{-1}MV_{2}^{-1}\\ V_{2}^{-1}MV_{1}^{-1}&{}V_{2}^{-1}W_{2}V_{2}^{-1} \end{array} \right] , \end{aligned}$$

\(M=E(\sqrt{q_{11}({\boldsymbol{\beta }}_{1,0})q_{21} ({\boldsymbol{\beta }}_{2,0})}\frac{\partial }{\partial {\boldsymbol{\beta }}_{1}}g_{1}({\boldsymbol{\beta }}_{1,0},X_{1})\frac{\partial }{\partial {\boldsymbol{\beta }}^{'}_{2}}g_{2}({\boldsymbol{\beta }}_{2,0},X_{1})),\) and \({{\boldsymbol{\beta }}}_{0}=({\boldsymbol{\beta }}_{1,0},{\boldsymbol{\beta }}_{2,0})^{'}\) denotes the true value of \({\boldsymbol{\beta }}.\)

Based on the above proposition, we state the strong consistency and asymptotic normality of \(\hat{{\boldsymbol{\lambda }}}_{\mathrm{cls}}\) in the following proposition.

Proposition 6

Under the assumptions (C1) and (C2), the CLS estimator \(\hat{{\boldsymbol{\lambda }}}_{\mathrm{cls}}\) is strongly consistent and asymptotically normal,

$$\begin{aligned} \sqrt{n}\left( \hat{{\boldsymbol{\lambda }}}_{\mathrm{cls}}-{\boldsymbol{\lambda }}_{0} \right) \overset{L}{\longrightarrow } N(\mathbf {0},D\Omega D^{'}) \end{aligned}$$

where \({\boldsymbol{\lambda }}_{0}=(\alpha _{0},\theta _{0},\phi _{0,0},\phi _{1,0})^{'}\) denotes the true values of \({\boldsymbol{\lambda }}=(\alpha ,\theta ,\phi _{0},\phi _{1})^{'}\) and

$$\begin{aligned} D=\left[ \begin{array}{cccc} 1&{}0&{}0&{}0\\ 0&{}\frac{1}{\phi _{2}}&{}\frac{1}{\phi _{2}^{2}}&{}\frac{\phi _{0}+ \phi _{1}+\theta \phi _{2}-1}{\phi _{2}^{2}}\\ 0&{}0&{}1&{}0\\ 0&{}0&{}0&{}1 \end{array} \right] , \end{aligned}$$

and \(\phi _{2}=1-\phi _{0}-\phi _{1}\).

Brief proofs of Propositions 3–6 are given in the Appendix.

4 Numerical Illustration

This part of the paper includes two subsections. In the first, the performance of the estimation methods presented in the previous section is evaluated through a simulation study. Moreover, the empirical proportions of zeros and ones in the simulated sample paths are compared with the results of Eqs. (15) and (16). To demonstrate the practical performance of the proposed process, the second part focuses on two real-life series: the number of daily infected cases due to COVID-19 in Barbados, available at https://ourworldindata.org/covid-cases, and the Poliomyelitis data from Zeger (1988) and Maiti et al. (2018).

4.1 Simulation

To conduct the simulation study, we need to generate a random sample from the INARZOIG(1) process. Based on the second stochastic representation in Zhang et al. (2016), we first generate a random sample \(\epsilon _1, \ldots , \epsilon _n\) from ZOIG\((\phi _0,\phi _1,\frac{\theta }{1+\theta })\) and then simulate \(\lbrace {X_t}\rbrace _{t=1}^{n}\) from the INARZOIG(1) model. The simulation comprises the following steps:

Step 1. Generate \(Z_{1}, \ldots , Z_{n}\) from \(\hbox{Bernoulli}(1-\phi ) \),

Step 2. Generate \(\eta _{1}, \ldots , \eta _{n}\) from \(\hbox{Bernoulli}(p)\),

Step 3. Generate \(T_{1}, \ldots , T_{n}\) from \(\hbox{Ge}(\frac{\theta }{1+\theta })\),

Step 4. Set \(\epsilon _{i}=(1-Z_{i})\eta _{i}+Z_{i}T_{i}\) for i = 1, ..., n to obtain \(\epsilon _{1}, \ldots , \epsilon _{n}\),

where \(\phi _0=\phi (1-p)\) and \(\phi _1=\phi p\); an R sketch of this algorithm is given below.
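A possible R implementation of these four steps, followed by the thinning recursion (3), is the following (a sketch under the stated reparameterization \(\phi _0=\phi (1-p)\), \(\phi _1=\phi p\); the function names are ours).

```r
# Simulate ZOIG innovations via the mixture representation above
# (assumes phi0 + phi1 > 0), then build an INARZOIG(1) path by thinning
rzoig <- function(n, phi0, phi1, theta) {
  phi  <- phi0 + phi1
  p    <- phi1 / phi
  Z    <- rbinom(n, 1, 1 - phi)             # Step 1: Z = 1 selects the geometric part
  eta  <- rbinom(n, 1, p)                   # Step 2: inflated 0/1 component
  Tgeo <- rgeom(n, prob = 1 / (1 + theta))  # Step 3: T_i, geometric on {0,1,...} with mean theta
  (1 - Z) * eta + Z * Tgeo                  # Step 4
}

rinarzoig <- function(n, alpha, phi0, phi1, theta, x0 = 0) {
  eps <- rzoig(n, phi0, phi1, theta)
  x   <- numeric(n)
  x_prev <- x0
  for (t in 1:n) {
    x[t] <- rbinom(1, size = x_prev, prob = alpha) + eps[t]   # Eq. (3)
    x_prev <- x[t]
  }
  x
}

set.seed(123)
path <- rinarzoig(1000, alpha = 0.5, phi0 = 0.4, phi1 = 0.2, theta = 1)
```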

According to the above algorithm, we generate a random sample (with n = 1000) from the INARZOIG(1) process with \(\phi _{0} =0.4\), \(\phi _{1} =0.2\), \(\theta =1, 5\) and \(\alpha =0.1, 0.5\). The sample paths and barplots of the marginal distribution of the simulated count time series are presented in Fig. 1.

Fig. 1 Barplots of the limiting marginal distribution and sample paths of the simulated INARZOIG(1) process for \(\phi _{0}=0.4\), \(\phi _{1}=0.2\), \(\theta =1,5\) and \(\alpha =0.1,0.5\)

As can be seen from Fig. 1, for all values of \(\alpha \), the sample path tends to take larger values when \(\theta \) is larger. For smaller values of \(\theta \), the process has a strong tendency to return to zero or one, with a smaller mean and variance, which is clear from parts (b) and (d) of Proposition 1.

In addition, Fig. 1 shows that the number of zeros and ones increases as \(\theta \) decreases.

To compare the performance of the CML and the CLS estimators, we simulate data for n = 50, 100, 200, 500, 1000, \(\alpha =0.2\), \(\phi _0=0.1, 0.4\), \(\phi _1=0.1\) and \(\theta =1, 3\) with 10,000 replications. The mean and mean squared error (MSE) of the estimates are computed to evaluate them. The function nlminb in R is used to obtain these estimates. The results of the simulation are given in Tables 1 and 2. These tables show that the CML estimator performs better than the CLS estimator, with smaller MSE (except in a few cases).

Table 1 Mean and MSE for CML and CLS estimators for \(\theta =1\)
Table 2 Mean and MSE for CML and CLS estimators for \(\theta =3\)

In Table 3, we compare the empirical distribution of the simulated sample paths with Eqs. (15) and (16); it can be seen that, for different values of n and of the other model parameters, the estimated proportions of zeros and ones are close to their theoretical values.

Table 3 Estimated values of the proportion of zeros and ones in the simulated data from INARZOIG(1) processes for different values of n

4.2 Real Data

In this subsection, using two real-life data sets, we show the applicability of the INARZOIG(1) model. In the first example, we use the data of new infected cases in Barbados from March 17, 2020, until January 2, 2021, and in the second, we consider the Poliomyelitis data, the monthly cases in the USA from 1970 to 1983. We compare the INARZOIG(1) model with various INAR(1) models for these data sets, namely OMGINAR(1) (one-modified geometric INAR(1) model), PINAR(1) (INAR(1) process with Poisson-distributed innovations), ZIPINAR(1) (INAR(1) process with zero-inflated Poisson-distributed innovations), OIPINAR(1) (INAR(1) process with one-inflated Poisson-distributed innovations), ZOIPINAR(1) (INAR(1) process with zero–one-inflated Poisson-distributed innovations), ZOIPLINAR(1) (INAR(1) process with zero–one-inflated Poisson–Lindley distributed innovations), INARG(1) (INAR(1) process with geometric-distributed innovations), INARZIG(1) (INAR(1) process with zero-inflated geometric-distributed innovations) and INAROIG(1) (INAR(1) process with one-inflated geometric-distributed innovations). We use the AIC (Akaike information criterion), Loglik (log-likelihood function), AICc (corrected version of the AIC), BIC (Bayesian information criterion), PMAE(h) (predicted mean absolute error) and PTP(h) (percentage of true prediction) criteria, where the last two are h-step-ahead forecasting accuracy measures.

To calculate the last two measures, we divide the data into two parts. The first part is used to fit the considered models, and the second part, consisting of the last 20 observations, is used to compute \(\hat{X}_{t+h}\); PMAE(h) and PTP(h) are then computed for h = 1.
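For transparency, the hold-out accuracy measures can be computed as follows (an R sketch of our own, assuming the usual definitions: PMAE is the mean absolute difference between the hold-out values and their forecasts, and PTP is the percentage of forecasts that, after rounding, coincide with the observed counts).

```r
# Hold-out forecasting accuracy measures (assumed definitions, see lead-in)
pmae <- function(obs, pred) mean(abs(obs - pred))
ptp  <- function(obs, pred) 100 * mean(round(pred) == obs)
```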

4.2.1 COVID-19 Data in Barbados

We use the data of new infected cases in Barbados from March 17, 2020, until January 2, 2021. This data set has 292 observations, of which 148 (51%) are zero, 64 (22%) are one, and the remaining 80 (27%) are greater than one. The mean and variance of the observations are 1.35 and 5.60, respectively; hence, the Fisher index is 4.15, showing that the data are overdispersed. The barplot and series plot, and the ACF and PACF, are shown in Figs. 2 and 3, respectively. It is noted that the PACF shows significant lags greater than one, suggesting that an INAR model of order greater than 1 might be suitable for the data. To determine the order of the process, we use the portmanteau tests introduced by Foroughi et al. (2021) with m = 2, 3, 4, 8, 12. We test the null hypothesis that the data set follows the INAR(1) model versus the alternative hypothesis that the data follow an INAR(p) model with \(p>1\). The p values are reported in Table 4 and show that order 1 is appropriate for this data set.

Fig. 2 Barplot and series plot of the new infected cases in Barbados

Fig. 3 ACF and PACF of the new infected cases in Barbados

Table 4 The p values of the portmanteau tests for different values of m

The relative frequencies of zeros and ones and Fig. 2 show inflation in zeros and ones. These extra zeros and ones motivated us to use the INARZOIG model for this data set. To show the adequacy of the INARZOIG(1), we compare the proposed model with OMGINAR(1), PINAR(1), ZIPINAR(1), OIPINAR(1), ZOIPINAR(1), ZOIPLINAR(1), INARG(1), INARZIG(1) and INAROIG(1) based on the aforementioned criteria. The results in Table 5 show that the INARZOIG(1) has the best fit, since it has the largest Loglik and the smallest values of the other criteria except the BIC (indicated by bold numbers). In terms of BIC, INARZIG(1) is the best model; to show that INARZOIG(1) is more suitable for this data set than INARZIG(1), we use the likelihood ratio test (LRT) with the following hypotheses.

$$\begin{aligned} \left\{ \begin{array}{l} H_{0}: \phi _1=0 \\ H_{1}: \phi _1 \ne 0 \\ \end{array}\right. \end{aligned}$$
(23)

The LRT statistic equals 3.937 and the critical value at level 0.05 is 3.841. Hence, the null hypothesis is rejected and the zero-and-one-inflated distribution is more suitable than the zero-inflated model for this data set. We also calculated the two forecasting accuracy measures; they are the same for all models, with PMAE equal to 4.45 and PTP equal to 20.

Table 5 Parameter estimations and their standard errors and Loglik, AIC, AICc and BIC criteria for compared models that are fitted to daily new infected cases of COVID-19 in Barbados

Figure 4 shows the daily new infected cases of COVID-19 in Barbados and their predicted values using INARZOIG(1). As can be seen, the predicted values are close to the original data, which indicates the good forecasting performance of the proposed fitted model.

Fig. 4 Daily new infected cases of COVID-19 in Barbados and their predicted values using INARZOIG(1)

4.2.2 Poliomyelitis Data

In this subsection, we consider the Poliomyelitis data, the monthly cases in the USA from 1970 to 1983. These data were first analyzed by Zeger (1988). This data set has 168 observations, of which 64 (38%) are zero, 55 (32%) are one, and the remaining 49 (30%) are greater than one. The mean and variance of the observations are 1.33 and 3.50, respectively; hence, the Fisher index is 2.63, indicating that the data are overdispersed. Recently, Maiti et al. (2018) considered these data and fitted most of the existing INAR(1) models, including the Poisson INAR(1), overdispersed models such as the geometric INAR(1) and compound Poisson INAR(1), zero-inflated models such as the zero-inflated and zero-modified INAR(1), and their proposed sub-model, the one-modified geometric INAR(1) (OMGINAR(1)). Using some goodness-of-fit criteria and 1-step-ahead forecasting accuracy measures, they showed that OMGINAR(1) had the best fit among all considered models.

Now, we analyze the data further. First, we present the barplot and the series plot in Fig. 5. These figures and the frequencies of the observed zeros and ones show the extra number of zeros and ones. This fact, together with the overdispersion of the data, motivated us to fit the INARZOIG(1) model to this data set. The ACF and PACF of the data are plotted in Fig. 6.

Fig. 5 Barplot and series plot of monthly cases of poliomyelitis data in the USA from 1970 to 1983

Fig. 6 ACF and PACF of the monthly cases of poliomyelitis data in the USA from 1970 to 1983

Based on the conclusions of Maiti et al. (2018) about this data set, we compare our model with OMGINAR(1) and use the criteria reported in that paper for this model. We also consider the ZOIPLINAR(1), introduced by Mohammadi et al. (2021), as another alternative. We use the Loglik, AIC, AICc, and BIC criteria, and the results are reported in Table 6. As can be seen, the INARZOIG(1) model has the largest Loglik and the smallest AIC and AICc, but the OMGINAR(1) has the smallest BIC. Nevertheless, based on Raftery (1995), since the difference between these BIC values is less than 2, it is not significant, and the other criteria show that the INARZOIG(1) is more suitable for this data set. We conclude that our model has the best fit for this data set; the forecasting accuracy measures are the same for all considered models, with PMAE equal to 0.95 and PTP equal to 45. Moreover, Fig. 7, which shows the Poliomyelitis data and their predicted values, indicates that the predicted values are close to the real data; this again points to the good forecasting performance of the INARZOIG(1).

Table 6 Parameter estimation and Loglik, AIC, BIC and AICc criteria for compared models for Poliomyelitis data
Fig. 7 Poliomyelitis data and their predicted values using INARZOIG(1)

5 Conclusion

This paper analyzes zero-and-one-inflated count time series using a flexible INAR(1) model based on zero-and-one-inflated geometric-distributed innovations. The main properties of this novel INAR(1) process are established, and its model parameters are estimated via the CML and CLS approaches. The performance of the two estimation techniques is assessed through some Monte Carlo experiments, wherein both approaches are shown to provide consistent estimates, with the CML approach providing less biased estimates.

Furthermore, the INARZOIG(1) model is applied to analyze the COVID-19 series from Barbados, which is found to contain frequent zeros and ones. Also, using the portmanteau test, we show that order 1 is suitable for this data set. As a second example, the model is applied to the monthly Poliomyelitis cases in the USA from 1970 to 1983, first analyzed by Zeger (1988). In both data applications, the INARZOIG(1) model is shown to provide better fitting criteria than the existing competing models. Evidently, the statistical performance of the INARZOIG(1) depends on the nature of the data as well, but overall the INARZOIG(1) model is a worthy addition to the class of INAR models.