Abstract

In this paper, we study the use of the mean empirical likelihood (MEL) method in a first-order random coefficient integer-valued autoregressive model. The MEL ratio statistic is established, its limiting properties are discussed, and the confidence regions for the parameter of interest are derived. Furthermore, a simulation study is presented to demonstrate the performance of the proposed method. Finally, a real data analysis of dengue fever is performed.

1. Introduction

Integer-valued time series data are commonly encountered in many fields, such as economics, finance, actuarial science, medicine, and epidemiology (e.g., the number of patients in a hospital at a specific point in time and the number of persons in a queue waiting for service at a certain moment). Research on integer-valued time series started in the 1980s and has followed two main approaches: state-space models based on an unobserved “state” process and thinning models based on a thinning operation “$\circ$.” Regarding state-space models, we can refer to the paper by Fukasawa and Basawa [1]. Regarding thinning models, we can refer to the paper by Steutel and van Harn [2], which mainly proposed the binomial thinning operation. Let $X$ be a non-negative integer-valued random variable and $\alpha \in [0, 1]$. Then, the binomial thinning operator “$\circ$” is defined as
$$\alpha \circ X = \sum_{i=1}^{X} B_i, \tag{1}$$
where $\{B_i\}$ is an i.i.d. Bernoulli random sequence with $P(B_i = 1) = \alpha$, which is also independent of $X$. Based on the thinning operator “$\circ$,” a first-order autoregressive process with count- or integer-valued data (INAR(1)) was defined by Al-Osh and Alzaid [3] as follows:
$$X_t = \alpha \circ X_{t-1} + Z_t, \quad t \ge 1, \tag{2}$$
where $\{Z_t\}$ is a sequence of i.i.d. non-negative integer-valued random variables with mean $\lambda$ and variance $\sigma_Z^2$, and $Z_t$ is independent of $X_{t-1}$. The INAR(1) model has been discussed by many authors. Al-Osh and Alzaid [4] introduced a family of models for a stationary sequence of dependent binomial random variables and discussed the existence of a stationary distribution for the binomial AR(1) process. Al-Osh and Aly [5] presented AR(1) models with negative binomial and geometric marginals and investigated some properties of the processes. McKenzie [6] described some simple models that may be used for modelling or generating sequences of dependent discrete random variates with negative binomial and geometric univariate marginal distributions.
Later, McKenzie [7] discussed the problem of defining a practically useful representation for the innovation process of a first-order autoregression with a negative binomial marginal distribution. Moreover, McKenzie [8] demonstrated that the powerful Markov property, which greatly simplifies the distributional structure of finite autoregressions, is analogous to (non-Markovian) finite moving-average processes. Furthermore, McKenzie [9] developed and investigated a family of models for discrete-time processes with Poisson marginal distributions. Alzaid and Al-Osh [10] investigated some properties of INAR(1) processes.
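The binomial thinning operator and the INAR(1) recursion described above can be illustrated with a minimal Python sketch. It assumes Poisson innovations, which is only one admissible choice for the innovation distribution; the function names (`binomial_thinning`, `poisson`, `simulate_inar1`) are hypothetical and introduced purely for illustration.

```python
import math
import random


def binomial_thinning(alpha, x, rng):
    """Return alpha ∘ x: the sum of x i.i.d. Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_inar1(n, alpha, lam, x0=0, seed=0):
    """Simulate X_1..X_n from X_t = alpha ∘ X_{t-1} + Z_t.

    Assumption for illustration: Z_t ~ Poisson(lam).
    """
    rng = random.Random(seed)
    path, x = [], x0
    for _ in range(n):
        x = binomial_thinning(alpha, x, rng) + poisson(lam, rng)
        path.append(x)
    return path
```

Under these assumptions the stationary mean is the innovation mean divided by one minus the thinning probability, so a long path simulated with thinning probability 0.5 and innovation mean 1 should average close to 2.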

In some practical applications, the parameter $\alpha$ may vary with time. For example, let $X_t$ denote the number of unemployed people in month $t$. Here, $X_t$ could potentially satisfy an INAR(1) model, where $\alpha \circ X_{t-1}$ is the number of unemployed people in month $t$ who were also unemployed in the previous month and $Z_t$ represents the number of newly unemployed people in the current month. Here, $\alpha$ represents the unemployment rate, which may be affected by economic conditions and other factors and can vary randomly over time. Zheng et al. [11] introduced a first-order random coefficient integer-valued autoregressive (RCINAR(1)) process as follows:
$$X_t = \phi_t \circ X_{t-1} + Z_t, \quad t \ge 1, \tag{3}$$
where $\{\phi_t\}$ is an i.i.d. sequence with cumulative distribution function (CDF) $P_\phi$ on $[0, 1)$; $\{Z_t\}$ is an i.i.d. non-negative integer-valued sequence with probability mass function (PMF) $f_z$, in which $f_z > 0$; and $X_0$, $\{\phi_t\}$, and $\{Z_t\}$ are independent. Here, $\phi_t \circ X_{t-1} = \sum_{i=1}^{X_{t-1}} B_i^{(t)}$, where, given $\phi_t$, $\{B_i^{(t)}\}$ is an i.i.d. Bernoulli random sequence with $P(B_i^{(t)} = 1 \mid \phi_t) = \phi_t$; $\{B_i^{(t)}\}$ is also independent of $X_{t-1}$.
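To make the random coefficient mechanism concrete, the following Python sketch simulates a path in which a fresh thinning probability is drawn at every time step. The Beta distribution for the random coefficient and the Poisson distribution for the innovations are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_rcinar1(n, a, b, lam, x0=0, seed=0):
    """Simulate X_0..X_n from an RCINAR(1) recursion.

    Assumptions for illustration: the random coefficient is Beta(a, b)
    and the innovation is Poisson(lam).
    """
    rng = random.Random(seed)
    path, x = [x0], x0
    for _ in range(n):
        coef = rng.betavariate(a, b)  # fresh random coefficient for this step
        # Conditional on coef, each of the x individuals survives independently.
        survivors = sum(1 for _ in range(x) if rng.random() < coef)
        x = survivors + poisson(lam, rng)
        path.append(x)
    return path
```

With a Beta(2, 3) coefficient (mean 0.4) and unit innovation mean, the long-run average of the path should be close to 1/(1 - 0.4), roughly 1.67.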

Zheng et al. [11] established the ergodicity of the process, obtained the moments and autocovariance functions, and derived the conditional least-squares (CLS) and quasi-likelihood estimators of the model parameters. In recent years, RCINAR(1) models have been discussed in many studies. Roitershtein and Zhong [12] studied the asymptotic behaviour of the RCINAR(1) model in the case where the additive term in the underlying random linear recursion belongs to the domain of attraction of a stable law. Zhang and Wang [13] presented the explicit expressions for the higher-order moments and cumulants of the RCINAR(1) process. Zhao and Hu [14] applied the least-squares method to estimate the parameters in the RCINAR(1) process. Kang [15] considered the problem of testing for parameter changes in RCINAR models. Li et al. [16] introduced a first-order random coefficient integer-valued threshold autoregressive process based on binomial thinning. Bakouch et al. [17] introduced a new stationary first-order integer-valued autoregressive process with random coefficient and zero-inflated geometric marginal distribution. Zhang et al. [18] introduced the RCINAR(1) process with generalized negative binomial marginals. Yu et al. [19] proposed a new bivariate RCINAR(1) (BRCINAR(1)) process with dependent innovations.

Many research methods have been applied to INAR models, among which the empirical likelihood (EL) method has been the main focus in recent years. The EL method, introduced by Owen [20] and further studied by Owen [21] and others, is a nonparametric statistical method. The EL method is a useful tool for statistical inference and has been successfully applied to many areas, such as linear regression models [22], generalized linear models [23], generalized estimating equations [24], dependent processes [25], semiparametric varying-coefficient partially linear regression models [26], and the limit theory of RCINAR(1) processes [27]. Zhao and Yu [28] estimated the variance of the random coefficient in the RCINAR(1) process by the EL method.

Although the EL method has many advantages and has been widely applied in various scenarios, it also has some problems; for example, the empirical likelihood ratio confidence regions may have poor coverage accuracy, especially in small-sample and multidimensional scenarios. Many attempts to solve this problem have been reported in the literature. DiCiccio et al. [29] proved that the EL is Bartlett correctable. Chen et al. [30] introduced the adjusted EL (AEL), and Tsao and Wu [31] introduced the extended EL (EEL). These methods provide improved results in small-sample scenarios, but the calculation is complex and involves a new parameter estimation method. Liang et al. [32] introduced the mean empirical likelihood (MEL) method, which is simple and rapid to implement and much more accurate than the previous EL methods.

In this paper, we focus on the use of the MEL method for the RCINAR(1) model (3). The MEL ratio statistic is derived, and its limiting properties are discussed. Specifically, the confidence region is derived for the parameter of interest.

The rest of this paper is organized as follows. In Section 2, we introduce the main results. In Section 3, we present some simulation results. In Section 4, we apply our method to dengue fever case data. Finally, in Section 5, we prove the main results.

2. Mean Empirical Likelihood for an RCINAR(1) Process

In this section, we discuss how to use the MEL method for the RCINAR(1) model (3). Zheng et al. [11] noted that the process $\{X_t\}$ is an irreducible, aperiodic, and positive recurrent (and hence ergodic) Markov chain.

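Let $\phi = E(\phi_t)$, $\sigma_\phi^2 = \operatorname{Var}(\phi_t)$, $\lambda = E(Z_t)$, and $\sigma_Z^2 = \operatorname{Var}(Z_t)$; note that they are all assumed to be finite. Let $\theta = (\phi, \lambda)^{\top}$. We use the MEL method to estimate the unknown parameter $\theta$. Based on the sample $X_0, X_1, \ldots, X_n$, Zheng et al. [11] derived the conditional least-squares (CLS) estimator of the model parameter. The CLS estimator of $\theta$ is obtained by minimizing $Q(\theta)$ over $\theta$, where
$$Q(\theta) = \sum_{t=1}^{n} \bigl(X_t - E(X_t \mid X_{t-1})\bigr)^2. \tag{4}$$

Note that $E(X_t \mid X_{t-1}) = \phi X_{t-1} + \lambda$; then,
$$Q(\theta) = \sum_{t=1}^{n} \bigl(X_t - \phi X_{t-1} - \lambda\bigr)^2. \tag{5}$$

By taking the derivative of $Q(\theta)$ with respect to $\theta$, we obtain the estimating equation:
$$\sum_{t=1}^{n} H_t(\theta) = 0, \tag{6}$$
where
$$H_t(\theta) = \bigl(X_t - \phi X_{t-1} - \lambda\bigr)\,(X_{t-1}, 1)^{\top}. \tag{7}$$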
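Because the conditional mean of the process is linear in the previous observation, the CLS estimating equation reduces to the ordinary least-squares normal equations for regressing each observation on its predecessor. The sketch below solves them in closed form; the function names `cls_estimate` and `estimating_sum` are hypothetical and used only for illustration.

```python
def cls_estimate(x):
    """Return the CLS estimate (slope, intercept) from x = [X_0, ..., X_n].

    The slope estimates the mean of the random coefficient and the
    intercept estimates the innovation mean.
    """
    prev, curr = x[:-1], x[1:]
    n = len(curr)
    s_p, s_c = sum(prev), sum(curr)
    s_pp = sum(p * p for p in prev)
    s_pc = sum(p * c for p, c in zip(prev, curr))
    denom = n * s_pp - s_p * s_p
    if denom == 0:
        raise ValueError("lagged observations are constant; slope is not identified")
    slope = (n * s_pc - s_p * s_c) / denom
    intercept = (s_c - slope * s_p) / n
    return slope, intercept


def estimating_sum(x, slope, intercept):
    """Sum over t of (X_t - slope*X_{t-1} - intercept) * (X_{t-1}, 1)."""
    total_a = total_b = 0.0
    for prev, curr in zip(x[:-1], x[1:]):
        resid = curr - slope * prev - intercept
        total_a += resid * prev
        total_b += resid
    return total_a, total_b
```

At the CLS estimate both components returned by `estimating_sum` vanish up to rounding error, which is a convenient sanity check on any implementation.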

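Let $M(\theta) = \{(H_i(\theta) + H_j(\theta))/2 : 1 \le i \le j \le n\}$. The elements of the set $M(\theta)$ are denoted by $M_k(\theta)$, $k = 1, \ldots, N$; let $N$ be equal to the number of elements in $M(\theta)$, and it is easy to see that $N = n(n+1)/2$. We define the MEL ratio statistic of $\theta$ as
$$R(\theta) = \max\Bigl\{\prod_{k=1}^{N} N p_k : p_k \ge 0,\ \sum_{k=1}^{N} p_k = 1,\ \sum_{k=1}^{N} p_k M_k(\theta) = 0\Bigr\}, \tag{8}$$
where $p_k$ is the probability weight assigned to $M_k(\theta)$. According to the method of Lagrange multipliers, let
$$G = \sum_{k=1}^{N} \log(N p_k) - N b^{\top} \sum_{k=1}^{N} p_k M_k(\theta) + \gamma\Bigl(\sum_{k=1}^{N} p_k - 1\Bigr), \tag{9}$$
where $b$ and $\gamma$ are the Lagrange multipliers. From
$$\frac{\partial G}{\partial p_k} = \frac{1}{p_k} - N b^{\top} M_k(\theta) + \gamma = 0, \tag{10}$$
we know that $\sum_{k=1}^{N} p_k\, \partial G / \partial p_k = N + \gamma = 0$ and $\gamma = -N$. Then, from (10), we obtain $p_k = 1 / \bigl(N(1 + b^{\top} M_k(\theta))\bigr)$. Hence, $R(\theta) = \prod_{k=1}^{N} \bigl(1 + b^{\top} M_k(\theta)\bigr)^{-1}$, where $b$ satisfies
$$\frac{1}{N} \sum_{k=1}^{N} \frac{M_k(\theta)}{1 + b^{\top} M_k(\theta)} = 0. \tag{11}$$

Thus, the log EL ratio statistic has the form
$$-2 \log R(\theta) = 2 \sum_{k=1}^{N} \log\bigl(1 + b^{\top} M_k(\theta)\bigr). \tag{12}$$

Further, let $l(\theta) = -2 \log R(\theta)$. The MEL ratio statistic is defined as
$$l^{M}(\theta) = \frac{l(\theta)}{n + 1}. \tag{13}$$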
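The construction above can be sketched numerically in three steps: form the CLS estimating functions, average them pairwise to obtain the augmented set of n(n+1)/2 points, and solve for the Lagrange multiplier. The sketch below uses a damped Newton iteration on the concave dual problem, which is one possible solver and not necessarily the one used by the authors. It also assumes that the MEL statistic is the ordinary log EL ratio on the pairwise means divided by n + 1, following the MEL construction of Liang et al. [32]. All function names are hypothetical.

```python
import math


def estimating_functions(x, slope, intercept):
    """Residual times (X_{t-1}, 1) for t = 1..n, with x = [X_0, ..., X_n]."""
    return [((c - slope * p - intercept) * p, c - slope * p - intercept)
            for p, c in zip(x[:-1], x[1:])]


def pairwise_means(h):
    """All averages (h_i + h_j)/2 with i <= j, giving n(n+1)/2 points."""
    n = len(h)
    return [((h[i][0] + h[j][0]) / 2, (h[i][1] + h[j][1]) / 2)
            for i in range(n) for j in range(i, n)]


def neg2_log_el(points, max_iter=100, tol=1e-10):
    """Return 2 * max_b sum(log(1 + b.m)); math.inf if no maximiser is found."""
    def objective(b):
        total = 0.0
        for m in points:
            w = 1.0 + b[0] * m[0] + b[1] * m[1]
            if w <= 0.0:
                return -math.inf  # outside the feasible region
            total += math.log(w)
        return total

    b, f = (0.0, 0.0), 0.0
    for _ in range(max_iter):
        g0 = g1 = a00 = a01 = a11 = 0.0
        for m in points:
            w = 1.0 + b[0] * m[0] + b[1] * m[1]
            g0 += m[0] / w
            g1 += m[1] / w
            a00 += m[0] * m[0] / (w * w)
            a01 += m[0] * m[1] / (w * w)
            a11 += m[1] * m[1] / (w * w)
        det = a00 * a11 - a01 * a01
        if det <= 0.0:
            return math.inf  # degenerate (collinear) points
        d0 = (a11 * g0 - a01 * g1) / det  # Newton ascent direction
        d1 = (a00 * g1 - a01 * g0) / det
        if g0 * d0 + g1 * d1 < tol:  # Newton decrement: converged
            return 2.0 * f
        step = 1.0
        while step > 1e-12:  # halve the step until the objective increases
            cand = (b[0] + step * d0, b[1] + step * d1)
            f_new = objective(cand)
            if f_new > f:
                break
            step /= 2.0
        else:
            return math.inf
        b, f = cand, f_new
    return math.inf


def mel_statistic(x, slope, intercept):
    """Log EL ratio on the pairwise means, divided by n + 1 (assumed scaling)."""
    h = estimating_functions(x, slope, intercept)
    return neg2_log_el(pairwise_means(h)) / (len(h) + 1)
```

At the CLS estimate the estimating functions sum to zero, so the multiplier is zero and the statistic equals zero; it grows as the candidate parameter moves away from the estimate. Returning infinity when the iteration fails is a design choice that treats such parameter values as lying outside any confidence region.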

To obtain the limiting properties of the MEL ratio statistic $l^{M}(\theta)$, we impose the following assumptions:
(C1) $\{X_t\}$ is a strictly stationary and ergodic RCINAR(1) process
(C2) $E|X_t|^4 < \infty$

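The limit distribution of the MEL ratio statistic $l^{M}(\theta)$ is established in the following theorem.

Theorem 1. Under Assumptions (C1) and (C2), we have
$$l^{M}(\theta) \xrightarrow{d} \chi^2(2), \tag{14}$$
where $\chi^2(2)$ is a chi-square distribution with 2 degrees of freedom.

According to Theorem 1, we can construct the confidence region for the parameter $\theta$. The $100(1-\alpha)\%$ confidence region of $\theta$ is
$$\bigl\{\theta : l^{M}(\theta) \le \chi^2_{1-\alpha}(2)\bigr\}, \tag{15}$$
where $\chi^2_{1-\alpha}(2)$ is the $(1-\alpha)$-quantile of the chi-square distribution with 2 degrees of freedom.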
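For 2 degrees of freedom the chi-square quantile has a closed form, because the distribution function is 1 - exp(-x/2). The short sketch below uses this fact to obtain the critical value and to test whether a candidate parameter value belongs to the confidence region; the function names are hypothetical.

```python
import math


def chi2_df2_quantile(level):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    Solves 1 - exp(-q/2) = level for q, with 0 < level < 1.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    return -2.0 * math.log(1.0 - level)


def in_confidence_region(statistic_value, level=0.95):
    """True if a parameter value with this statistic lies in the region."""
    return statistic_value <= chi2_df2_quantile(level)
```

The critical values for the 0.95 and 0.90 levels are approximately 5.991 and 4.605, respectively.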

3. Simulation Results

In this section, we conduct simulation studies to compare the MEL confidence region with the EL, AEL, and EEL results.

Consider the RCINAR(1) modelwhere and .

We fixed  at 1 and then used the above model to generate data. We take  and . Four different sample sizes (n = 20, 30, 50, and 100) are investigated, and the nominal confidence levels are chosen as 0.95 and 0.90. All the simulations are based on 1000 replications. We evaluate the coverage probability of the confidence regions, and the results are summarized in Tables 1-4.
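The coverage probabilities reported in such a study are Monte Carlo frequencies: the proportion of replications in which the statistic evaluated at the true parameter does not exceed the critical value. The generic loop can be sketched as follows. To keep the example short and verifiable, a toy statistic with an exact chi-square distribution with 2 degrees of freedom stands in for the MEL ratio; in the setting of this section, `simulate` would instead generate an RCINAR(1) path and `statistic` would evaluate the MEL ratio at the true parameter. All names are hypothetical.

```python
import math
import random


def coverage_probability(simulate, statistic, critical_value, reps=1000, seed=0):
    """Fraction of replications whose statistic is at most critical_value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = simulate(rng)
        if statistic(sample) <= critical_value:
            hits += 1
    return hits / reps


def simulate_pairs(rng, n=50):
    """Toy data: n independent pairs of standard normal variates."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def toy_statistic(sample):
    """n * |sample mean|^2, which is exactly chi-square(2) at the true mean 0."""
    n = len(sample)
    m1 = sum(p[0] for p in sample) / n
    m2 = sum(p[1] for p in sample) / n
    return n * (m1 * m1 + m2 * m2)


critical = -2.0 * math.log(0.05)  # 0.95-quantile of chi-square(2)
estimated = coverage_probability(simulate_pairs, toy_statistic, critical, reps=2000)
```

With 1000 replications and a true coverage of 0.95, the Monte Carlo standard error of the estimated coverage is about 0.007, which is worth keeping in mind when comparing methods whose coverage differs by only one or two percentage points.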

Tables 1-4 show that the coverage probability of the confidence region approaches the nominal confidence levels (0.95 and 0.90) as the sample size n increases. The MEL method performs similarly to the EEL method. The MEL and EEL coverage probabilities are much larger than the nominal levels when the sample size is small. In all cases, the MEL method is uniformly better than the EL method, and it is much more accurate when the sample size is small.

In order to further study the performance of MEL method, we give the figure of the confidence region for n = 20, 30, 50, and 100 when and (Figure 1). At the same time, we calculated the CI length of and , and the results are summarized in Table 5.

Table 5 and Figure 1 show that the confidence region is relatively large when the sample size is small, so the coverage probability of the confidence region is relatively large. However, as the sample size increases, the confidence region becomes smaller and the length of the confidence interval shortens.

4. Real Data Analysis

In this section, we apply our proposed methods to analyse the monthly counts of dengue fever cases in China from January 2004 through April 2012, as reported by the Chinese Center for Disease Control and Prevention (http://www.chinacdc.cn). The data are plotted in Figure 2 and consist of 100 observations, which are denoted by . The plots of the autocorrelation function (ACF) and partial autocorrelation function (PACF) for the series are given in Figures 3 and 4, respectively. The corresponding plots of the sample ACF and PACF indicate an AR(1)-like autocorrelation structure.
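An AR(1)-like structure typically shows up as a sample ACF that decays roughly geometrically with the lag, together with a PACF that is negligible beyond lag 1. As a minimal sketch of the first diagnostic, the sample ACF can be computed as below; the function name `sample_acf` is hypothetical, and the sketch uses the common convention of dividing every autocovariance by the series length.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = []
    for k in range(max_lag + 1):
        ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf
```

For a count series, the lag-1 value gives a rough first impression of the average thinning probability before any formal estimation is carried out.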

In Figure 5, based on the observation data , we give the figure of the MEL ratio confidence region when the confidence level is 0.95. Through the calculation, we have that the least-squares estimation , which is denoted by in Figure 5. From Figure 5, we can see that is in the MEL ratio confidence region.

5. Proof of Theorem 1

In this section, we present the proof of Theorem 1. To obtain the proof, we need the following lemmas.

Lemma 1. Assume that (C1) and (C2) hold. Then,

Proof. Note thatThus, according to Lemma 4.2 of Zhang et al. [33], we know that Lemma 1 holds.

Lemma 2. Assume that (C1) and (C2) hold. Then,

Proof. Note thatThus, by Assumption (C1), we know that Lemma 2 holds.

Lemma 3. Assume that (C1) and (C2) hold. Then,where , , , and .

Proof. Note thatThus, by Lemma 2.1 and Lemma 4.1 of Zhang et al. [33], we know that Lemma 3 holds.

Lemma 4. Assume that (C1) and (C2) hold. Then,

Proof. Note thatBy the strong law of large numbers and Assumptions (C1) and (C2), we have thatTherefore, by (24)–(26), we know that Lemma 4 holds.

Lemma 5. Assume that (C1) and (C2) hold. Then,

Proof. Let , where . From (11), we know thatwhere . By Lemma 3, we have , where and are the largest and smallest eigenvalues, respectively, of . Next, we provide the proof in three steps.(i)Step 1. We prove thatTo prove (29), we need to prove thatNote thatFrom Lemma 2.1 of Zhang et al. [33], we know thatThen, we have , so (30) holds.(ii)Step 2. We prove thatFrom (28), we haveFrom Lemma 2.1 of Zhang et al. [33], it is easy to see that . Hence, according to (34), (33) holds.(iii)Step 3. We prove thatWe have proved that . Let , and we have . Note that , . Hence, (35) is proved, so Lemma 5 holds.

Lemma 6. Assume that (C1) and (C2) hold. Then,

Proof. From (11), we know thatHence,To prove Lemma 6, according to Lemma 3, we need to prove thatNote thatThe proof of Lemma 6 is complete.

Proof of Theorem 1. By the Taylor expansion, we know thatwhereTherefore,According to Lemma 2.1 and Lemma 4.1 of Zhang et al. [33] and Lemma 3, we know that Theorem 1 holds.

Data Availability

The data used to support the findings of this study have not been made available because our research data come from computer simulation.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (nos. 11871028, 11731015, and 11901053), the Natural Science Foundation of Jilin Province (nos. 20170101057JC and 20180101216JC), and the Program for Changbaishan Scholars of Jilin Province (2015010).