Empirical likelihood based on synthetic right censored data
Introduction
Liang et al. (2019) proposed a mean empirical likelihood (MeanEL) method based on synthetic pairwise mean data. Simulation results in Liang et al. (2019) showed that the MeanEL method performs better for heavy-tailed or highly skewed distributions and for exponentially tilted likelihood. However, Liang et al. (2019) did not establish theoretical comparisons between MeanEL and other existing EL methods, such as the Bartlett-corrected empirical likelihood (BEL) of DiCiccio et al. (1991), the adjusted empirical likelihood (AEL) of Chen et al. (2008) and the extended empirical likelihood (EEL) of Tsao and Wu (2013). This paper extends the MeanEL approach to right-censored data analysis. It also discusses a theoretical justification of why such synthetic data can improve coverage probability accuracy.
Assume that independent and identically distributed random observations with an unknown distribution function are subject to right censoring, so that we only observe where are censoring times with distribution , independent of survival times . We are interested in the estimation problem for a parameter . The true parameter value is the unique solution of the equation for some function . In this paper, we focus on estimating equations that have the true parameter value as their unique solution, since such estimating equations provide (asymptotically) unbiased estimates. There are many such examples of in the literature, and the existence and uniqueness of the solution are discussed therein (see Newey and Smith, 2004). Different choices of the function correspond to different parameters of interest. For example, if we choose , then is the expectation of , i.e. . Other examples include: (1) corresponding to being the mean residual lifetime at a given time ; (2) corresponding to being the cumulative hazard function at a given time ; (3) corresponding to being the quantile function at a given time .
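For concreteness, the censoring setup and the example estimating functions can be written out in conventional survival-analysis notation. The symbols below ($X_i$, $C_i$, $Z_i$, $\delta_i$, $g$, $t_0$, $p$) are assumptions chosen for illustration and may differ from the paper's own notation. The cumulative-hazard form additionally assumes a continuous distribution function $F$, and the quantile form is indexed by a probability level $p$.

```latex
\begin{align*}
  &Z_i = \min(X_i, C_i), \qquad \delta_i = I(X_i \le C_i), \qquad i = 1, \dots, n, \\
  &E\, g(X, \theta_0) = 0, \\
  &\text{mean: } g(x, \theta) = x - \theta, \\
  &\text{mean residual lifetime at } t_0\text{: } g(x, \theta) = (x - t_0 - \theta)\, I(x > t_0), \\
  &\text{cumulative hazard at } t_0\text{: } g(x, \theta) = I(x > t_0) - e^{-\theta}, \\
  &\text{quantile at level } p\text{: } g(x, \theta) = I(x \le \theta) - p.
\end{align*}
```

In each case, setting the expectation of $g$ to zero recovers the stated parameter; for example, $P(X > t_0) = e^{-\theta}$ gives $\theta = -\log\{1 - F(t_0)\}$.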
Based on synthetic data introduced in Liang et al. (2019), if is observed, the pairwise mean synthetic data set can be defined as, which can also be written as with . Based on the data set (1.3), the MeanEL ratio for is Under some regularity assumptions, Liang et al. (2019) proved the mean empirical log-likelihood ratio Therefore, the confidence interval can be constructed as .
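A minimal sketch of the uncensored construction for the mean may help fix ideas. It assumes that the pairwise mean data set consists of the values (X_i + X_j)/2 for all i <= j (including i = j) and that the resulting log-likelihood ratio is rescaled by n + 1 to obtain a chi-square limit with one degree of freedom. Both are assumptions here, not statements quoted from the source. The function names are hypothetical, and the Lagrange multiplier is found by plain bisection rather than by any method from the paper.

```python
import numpy as np

def pairwise_means(x):
    """Return the n(n+1)/2 pairwise means (x_i + x_j)/2 for i <= j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))  # upper triangle, diagonal included
    return 0.5 * (x[i] + x[j])

def neg2_log_el_ratio(values, theta, tol=1e-10, max_iter=200):
    """-2 log empirical likelihood ratio for the mean of `values` at `theta`.

    Solves sum_k u_k / (1 + lam * u_k) = 0 with u_k = values_k - theta.
    The left-hand side is strictly decreasing in lam on the feasible
    interval, so bisection is enough.
    """
    u = np.asarray(values, dtype=float) - theta
    if u.min() >= 0 or u.max() <= 0:
        return np.inf  # theta is not inside the convex hull of the data
    lo, hi = -1.0 / u.max(), -1.0 / u.min()  # keeps every 1 + lam * u_k > 0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(u / (1.0 + mid * u)) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * u))

def meanel_statistic(x, theta):
    """Rescaled MeanEL statistic for the mean (the n + 1 scaling is an assumption)."""
    n = len(x)
    return neg2_log_el_ratio(pairwise_means(x), theta) / (n + 1)

# Hypothetical usage on a skewed sample.
rng = np.random.default_rng(1)
sample = rng.lognormal(0.0, 1.0, 50)
stat = meanel_statistic(sample, theta=1.5)
```

A confidence interval would then collect the values of theta for which the statistic stays below the appropriate chi-square quantile.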
However, the above approach is not readily available under censoring, since we only observe () instead of and we cannot pairwise index variable directly. Therefore, we need to develop a new approach to construct, under right censoring, a synthetic data set, an estimating equation and a MeanEL ratio for .
This paper is organized as follows. In Section 2, we present the MeanEL methodology for right-censored data and show that the MeanEL still has a limiting distribution, which can be used to construct a MeanEL-based confidence interval. Simulation studies are presented in Section 3; they demonstrate that MeanEL outperforms the existing methods, especially for heavy-tailed distributions. Section 4 provides a real data analysis. A theoretical high-order accuracy justification of the different methods is provided in Section 5.
Methodology for censored data
Let be the ordered -values and be the concomitant of the th order statistic, that is if . Let , , . We define the pairwise mean data set as In this new data set , only those observations satisfying can be treated as uncensored. The following equation can be easily proved, Based on this equation, the MeanEL ratio can be defined as
Simulation studies
For a given sample size , we generate lifetime observations from a specific distribution and censoring time observations from a censoring distribution . Then, based on the simulated data, we compare the performance of the IC-confidence interval (He et al., 2016), the ScaledEL-confidence interval (Wang and Jing, 2001) and the MeanEL-confidence intervals proposed in the previous section.
In our simulation, the parameter of interest, , is the mean of , therefore
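The data-generating step can be sketched as follows. The lognormal lifetime distribution and exponential censoring distribution are placeholders chosen for illustration, not the distributions used in the paper, and the function name is hypothetical. Each observation is the smaller of the lifetime and the censoring time, together with an indicator of whether the lifetime was observed.

```python
import numpy as np

def simulate_right_censored(n, rng, lifetime_sampler, censoring_sampler):
    """Draw n right-censored observations (z, delta).

    z     = min(lifetime, censoring time)
    delta = 1 if the lifetime is observed, 0 if it is censored
    """
    x = lifetime_sampler(rng, n)   # latent lifetimes
    c = censoring_sampler(rng, n)  # independent censoring times
    z = np.minimum(x, c)
    delta = (x <= c).astype(int)
    return z, delta

# Hypothetical usage: heavy-tailed lifetimes with exponential censoring.
rng = np.random.default_rng(0)
z, delta = simulate_right_censored(
    200,
    rng,
    lifetime_sampler=lambda r, n: r.lognormal(0.0, 1.0, n),
    censoring_sampler=lambda r, n: r.exponential(5.0, n),
)
censoring_rate = 1.0 - delta.mean()
```

Changing the scale of the censoring sampler is a simple way to vary the censoring rate across simulation settings.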
Real data analysis
In this section, we compare our proposed methods with existing methods using the primary biliary cirrhosis (PBC) dataset, which is described in Fleming and Harrington (1991) and originates from a Mayo Clinic trial conducted between 1974 and 1984. It contains the survival times of 312 patients and a status variable, which indicates whether each patient's survival time is censored. We use this dataset to illustrate our proposed method described in Section 2. Fig. 2 presents the 95% confidence intervals for the
Theoretical comparisons
In this section, we present a theoretical comparison of MeanEL and other EL methods. Define with . Following Liu and Chen (2010), we assume that , then the original EL can be written as and the Bartlett correction uses the corrected statistic where , , and . Then the corrected statistic gives second order
CRediT authorship contribution statement
Wei Liang: Conceptualization, Methodology, Simulation, Writing. Hongsheng Dai: Methodology, Development, Main contribution in editing.
Acknowledgments
The first author is supported by the National Natural Science Foundation of China (11701484) and the Fundamental Research Funds for the Central Universities in China (20720190067).
References (13)
- Liang et al. (2019). Mean empirical likelihood. Comput. Statist. Data Anal.
- Chen et al. (2008). Adjusted empirical likelihood and its properties. J. Comput. Graph. Statist.
- DiCiccio et al. (1991). Empirical likelihood is Bartlett-correctable. Ann. Statist.
- Fleming and Harrington (1991). Counting Processes and Survival Analysis.
- He et al. (2016). Empirical likelihood for right censored lifetime data. J. Amer. Statist. Assoc.
- U-Statistics: Theory and Practice (1990).