Abstract

This paper describes two methods for predicting the non-observed (censored) units under progressive Type-II censored samples. The lifetimes under consideration follow a new two-parameter Pareto distribution. Point and interval estimates of the unknown parameters of the new Pareto model are also obtained, using maximum likelihood and Bayesian estimation. Since the Bayes estimators cannot be expressed explicitly, Gibbs sampling and Markov Chain Monte Carlo techniques are used for the Bayesian calculations. We use the posterior predictive density of the non-observed units to construct predictive intervals. A simulation study is performed to evaluate the estimators via mean square errors and biases and to identify the best prediction method for the censored observations under the progressive Type-II censoring scheme for different sample sizes and censoring schemes.

1. Introduction

Studying new lifetime models has become necessary as applications have multiplied across the natural sciences. Over the last four decades, many authors have focused on generating new lifetime distributions that fit experimental data from, for example, medicine, engineering, the social sciences, and reliability analysis. In the literature, these new models have shown good properties, and some have proved superior to the original ones. Many generalized classes of lifetime distributions have been used to describe various kinds of data (one may refer to Kumaraswamy [1] and Marshall and Olkin [2]). A new family of distributions should include the original distribution as a submodel and is expected to give more flexibility than the original model. In our work, we consider a new form of the Pareto distribution which was introduced by Bourguignon et al. [3]. The new Pareto model generalizes the original Pareto distribution, simplifies some mathematical calculations, and has new characteristics (see, for example, Almetwally and Haj Ahmad [4]).

In some life tests and reliability experiments, units may be removed or lost from the experiment before they fail. The loss can be unplanned, as in accidental damage to an experimental unit or when a unit under study drops out. Sometimes, the experiment must stop because testing facilities are unavailable. Most often, the removal of units from an experiment is preplanned and is intended to reduce time and cost. The benefit of progressive censoring lies in its efficient use of the available resources: surviving units that are removed early can be used for other tests or experiments. In reliability and life testing experiments, one of the main objectives is to draw inference about the unknown parameters of the lifetime distribution under consideration. Sometimes, this is based on censored observations (see Cohen [5]). Estimation and prediction problems arise naturally in many real-life situations, and in many studies, researchers are interested in estimating unknown parameters and/or making predictive inference about censored (future) observations.

The most commonly used censoring schemes are (i) progressive Type-I and (ii) progressive Type-II censoring schemes. One can refer to the books of Balakrishnan and Cramer [6] and Balakrishnan [7]. Recently, several authors have studied parameter inference for different distributions under the progressive Type-II censoring scheme (PC) (see, for example, Kundu [8], Pradhan and Kundu [9], Maurya et al. [10], and Bdair et al. [11]). In addition, inference under other censoring schemes has appeared in the literature for different lifetime models, such as hybrid Type-I progressive censoring, adaptive Type-II progressive censoring, Type-II hybrid censoring, and others (see, for example, Bdair and Haj Ahmad [12]; Haj Ahmad et al. [13]; Salah et al. [14]; Almetwally et al. [15]; and Sabry et al. [16]). There is still much room for further work on different censoring schemes under new generalized models.

In this paper, we restrict our attention to censored samples under the progressive Type-II censoring scheme (PC) and find point and interval estimates of the unknown parameters of the new Pareto distribution (NPD); then, we study the problem of predicting the future (unobserved) data.

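For the NPD, we can write the probability density function (PDF) as

$$f(x; \alpha, \beta) = \frac{2\alpha\beta^{\alpha}x^{\alpha-1}}{\left(x^{\alpha}+\beta^{\alpha}\right)^{2}}, \quad x \ge \beta, \tag{1}$$

where $\alpha > 0$ and $\beta > 0$ are the shape and scale parameters, respectively.

The cumulative distribution function (CDF) of NPD is given by

$$F(x; \alpha, \beta) = 1 - \frac{2\beta^{\alpha}}{x^{\alpha}+\beta^{\alpha}}, \quad x \ge \beta. \tag{2}$$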
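As a minimal sketch, the density, distribution function, and inverse-CDF sampler of the NPD can be written as follows. It assumes the form introduced by Bourguignon et al. [3], $F(x) = 1 - 2/[1 + (x/\beta)^{\alpha}]$ for $x \ge \beta$; the function names are illustrative and do not come from the paper.

```python
import random

def npd_pdf(x, alpha, beta):
    """Density 2*alpha*beta^alpha*x^(alpha-1) / (x^alpha + beta^alpha)^2 for x >= beta."""
    if x < beta:
        return 0.0
    r = (x / beta) ** alpha              # working with (x/beta)^alpha keeps powers well scaled
    return 2.0 * alpha * r / (x * (1.0 + r) ** 2)

def npd_cdf(x, alpha, beta):
    """Distribution function 1 - 2 / (1 + (x/beta)^alpha) for x >= beta."""
    if x < beta:
        return 0.0
    return 1.0 - 2.0 / (1.0 + (x / beta) ** alpha)

def npd_quantile(u, alpha, beta):
    """Inverse CDF: solving F(x) = u gives x = beta * ((1 + u) / (1 - u))^(1/alpha)."""
    return beta * ((1.0 + u) / (1.0 - u)) ** (1.0 / alpha)

def npd_sample(n, alpha, beta, rng):
    """Draw n lifetimes by inversion; rng is a random.Random instance."""
    return [npd_quantile(rng.random(), alpha, beta) for _ in range(n)]
```

Inversion is convenient here because the quantile function has a closed form, so no rejection step is needed.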

In this paper, we pursue two main objectives. First, we find point and interval estimates of the NPD parameters $\alpha$ and $\beta$ using maximum likelihood and Bayes estimation under PC, and we compare the two estimation methods numerically by simulation in R. Second, we consider the problem of predicting unobserved (future) data based on the observed (available) data. To this end, we consider two prediction methods: (i) the best unbiased predictor (BUP) and (ii) the Bayes predictor (BP). We construct predictive intervals (PIs) for the unobserved (future) data that are censored from the experiment. Numerical analysis and simulation are used to compare the efficiency of the prediction methods under consideration.

The PC is a generalization of the well-known Type-II right censoring scheme and has gained great attention in the last twenty years. We can describe this censoring scheme as follows: let $X_1, X_2, \ldots, X_n$ denote the lifetimes of $n$ independent and identically distributed units which are under a life test experiment. Also, suppose that $R_1, R_2, \ldots, R_m$ are fixed non-negative integers such that $n = m + \sum_{i=1}^{m} R_i$. We observe $m$ units and remove the remaining $n - m$ units progressively according to the censoring scheme $R = (R_1, R_2, \ldots, R_m)$. The censoring occurs progressively in $m$ stages, which yield the failure times of the $m$ observed units. When the first failure time $X_{1:m:n}$ (the first stage) occurs, $R_1$ of the $n - 1$ surviving units are randomly removed (censored) from the experiment. When the second failure time $X_{2:m:n}$ (the second stage) occurs, $R_2$ of the $n - R_1 - 2$ surviving units are randomly removed from the experiment. Finally, when the $m$th failure time $X_{m:m:n}$ (the $m$th stage) occurs, all the remaining $R_m = n - m - \sum_{i=1}^{m-1} R_i$ units are withdrawn from the experiment. We call this the progressive Type-II right censoring scheme $(R_1, R_2, \ldots, R_m)$. It is easy to verify that the Type-II right censoring scheme and the complete sampling scheme are special cases of PC, obtained by choosing $R = (0, \ldots, 0, n - m)$ and $R = (0, \ldots, 0)$ with $m = n$, respectively.
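The staged removal described above can be simulated directly. The sketch below follows the verbal description (sort the latent lifetimes, record the next failure, then withdraw the prescribed number of survivors at random); it is an illustration under that reading, not the simulation code used in the paper, and the function name is hypothetical.

```python
import random

def progressive_type2_sample(lifetimes, scheme, rng):
    """Return the m observed failure times under the censoring scheme (R_1, ..., R_m).

    lifetimes: the n latent failure times of all units on test.
    scheme:    list of m non-negative integers with m + sum(scheme) == n.
    rng:       a random.Random instance used to pick the withdrawn units.
    """
    m, n = len(scheme), len(lifetimes)
    if m + sum(scheme) != n:
        raise ValueError("need n = m + R_1 + ... + R_m")
    alive = sorted(lifetimes)              # units still on test, in failure order
    observed = []
    for r in scheme:
        observed.append(alive.pop(0))      # next failure among the surviving units
        for _ in range(r):                 # withdraw r of the survivors at random
            alive.pop(rng.randrange(len(alive)))
    return observed
```

With the scheme `[0, ..., 0, n - m]` the function returns the `m` smallest lifetimes, which is ordinary Type-II right censoring.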

Prediction is very important in statistics, and many authors have studied the prediction problem and its applications to real-life data (see, for example, Kaminsky and Rhodin [17]; Al-Hussaini [18]; Madi and Raqab [19]; Raqab et al. [20]; and Bdair et al. [11]). The idea of prediction is to predict future order statistics based on the observed sample data. Some authors have studied the problem of estimation and prediction under different types of censored data from different models (see, for example, Kim et al. [21]; Kundu [8]; and Kundu and Raqab [22]). Raqab et al. [23] studied the prediction of the remaining time for the generalized Pareto distribution under a progressively censored sample. Belaghi et al. [24] considered estimation and prediction problems for the Poisson-exponential distribution under Type-II censored data. Bdair et al. [11] used Bayes prediction to predict future values of a progressively censored sample under the flexible Weibull distribution.

The rest of the paper is organized as follows. In Section 2, we obtain the MLEs for the two parameters of NPD. The Bayesian estimation method is used to estimate the unknown parameters in Section 3. In Section 4, we handle the point and interval prediction problems for the unknown observations from the censored sample using the best unbiased predictors (BUPs) and the Bayesian predictor (BP). In Section 5, numerical comparisons are performed via simulation analysis. Finally, some conclusions are drawn in Section 6.

2. Maximum Likelihood Estimation

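In this section, we use the classical maximum likelihood method to estimate the two unknown parameters of NPD under the PC scheme. Let $x_i = x_{i:m:n}$, $i = 1, \ldots, m$, with $x_1 < x_2 < \cdots < x_m$, denote the observations under PC from a sample of size $n$ drawn from an NPD with PDF and CDF given by equations (1) and (2), respectively. Based on a progressive Type-II censored sample, the likelihood function is given by

$$L(\alpha, \beta) = C \prod_{i=1}^{m} f(x_i)\,[1 - F(x_i)]^{R_i}, \tag{3}$$

where $C = n(n - R_1 - 1)(n - R_1 - R_2 - 2)\cdots\left(n - \sum_{i=1}^{m-1} R_i - m + 1\right)$ (see Balakrishnan and Aggarwala [25]).

Using equations (1) and (2), and noting that $\sum_{i=1}^{m}(R_i + 1) = n$, we obtain

$$L(\alpha, \beta) = C\, 2^{n} \alpha^{m} \beta^{n\alpha} \prod_{i=1}^{m} \frac{x_i^{\alpha-1}}{\left(x_i^{\alpha} + \beta^{\alpha}\right)^{R_i+2}}. \tag{4}$$

The logarithmic likelihood function of NPD is

$$\ell(\alpha, \beta) = \ln C + n\ln 2 + m\ln\alpha + n\alpha\ln\beta + (\alpha - 1)\sum_{i=1}^{m}\ln x_i - \sum_{i=1}^{m}(R_i + 2)\ln\left(x_i^{\alpha} + \beta^{\alpha}\right). \tag{5}$$

We can notice that $\ell(\alpha, \beta)$ is monotonically increasing in $\beta$. Hence, since $\beta \le x_{1:m:n}$, the MLE of $\beta$ is $\hat{\beta} = x_{1:m:n}$, where $x_{1:m:n}$ is the first progressive order statistic.

From the logarithmic likelihood in equation (5), we take the partial derivative with respect to $\alpha$ and equate it to zero to obtain the MLE of $\alpha$; hence $\hat{\alpha}$ is the solution of

$$\frac{m}{\hat{\alpha}} + n\ln\hat{\beta} + \sum_{i=1}^{m}\ln x_i - \sum_{i=1}^{m}(R_i + 2)\,\frac{x_i^{\hat{\alpha}}\ln x_i + \hat{\beta}^{\hat{\alpha}}\ln\hat{\beta}}{x_i^{\hat{\alpha}} + \hat{\beta}^{\hat{\alpha}}} = 0. \tag{6}$$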
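The two estimation steps can be sketched numerically as follows. The sketch assumes the NPD form $F(x) = 1 - 2\beta^{\alpha}/(x^{\alpha} + \beta^{\alpha})$, $x \ge \beta$, of Bourguignon et al. [3]; under that assumption, writing $t_i = \ln(x_i/\hat{\beta})$, the score for $\alpha$ simplifies to $m/\alpha + \sum t_i - \sum (R_i + 2)\, t_i / (1 + e^{-\alpha t_i})$, which is strictly decreasing in $\alpha$, so plain bisection suffices. The function name and the choice of bisection are illustrative, not the paper's implementation.

```python
import math
import random

def npd_mle(x, scheme):
    """MLE of (alpha, beta) from a progressively Type-II censored NPD sample.

    x:      observed failure times in increasing order (length m >= 2).
    scheme: censoring scheme (R_1, ..., R_m).
    """
    m = len(x)
    if m < 2 or x[-1] <= x[0]:
        raise ValueError("need at least two distinct ordered failure times")
    beta_hat = x[0]                                  # log-likelihood increases in beta
    t = [math.log(xi / beta_hat) for xi in x]        # t_i = ln(x_i / beta_hat) >= 0

    def score(alpha):
        # d(loglik)/d(alpha) = m/alpha + sum t_i - sum (R_i + 2) * t_i * s_i,
        # where s_i = x_i^alpha / (x_i^alpha + beta^alpha) = 1 / (1 + exp(-alpha * t_i))
        total = m / alpha
        for ti, ri in zip(t, scheme):
            s = 1.0 / (1.0 + math.exp(-alpha * ti))
            total += ti - (ri + 2) * ti * s
        return total

    lo, hi = 1e-8, 1.0
    while score(hi) > 0.0:                           # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                             # bisection on the decreasing score
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), beta_hat
```

A general-purpose root finder or optimizer would work equally well; bisection is used only because the score is monotone and the sketch then needs nothing beyond the standard library.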

Numerical analysis and simulation are used to study the performance of MLE with respect to mean square errors (MSEs) and biases. We can observe the asymptotic confidence interval (CI) for and using asymptotic properties of the MLE such that , where and is the Fisher information matrix, i.e.,

The second partial derivatives are obtained as

The variances of the MLEs can be found from the asymptotic property of MLE so that , and , where is the determinant of information matrix . The asymptotic confidence intervals for and are given asrespectively, where is the lower percentile of the standard normal distribution.

3. Bayes Estimation

In the Bayesian method, all parameters are considered as random variables with a certain distribution called prior distribution. But if the prior information is not available which is usually the case, then we need to select one. Since the selection of prior distribution is important in parameter estimation, we chose the independent gamma distributions and , respectively, for the prior of and . Choosing this prior density is due to the fact that gamma prior has flexible characteristics as a non-informative prior, especially when the values of the hyperparameters are assumed to be zero. The suggested gamma distributions have the following densities:where , and are the hyperparameters of prior distributions and all are positive real constants.

The joint prior of and is

The joint posterior of and iswhere is the likelihood function of NPD under progressive censored samples as in equation (4). Substituting and for NPD under PC, the joint posterior density can be written aswhere and represents the PDF of gamma distribution.

Therefore, the Bayes estimate of any function of and , say , under the quadratic loss function is . Since it is difficult to compute this expected value analytically, we decided to use the Markov Chain Monte Carlo technique (MCMC) (see Karandikar [26]).

Gibbs sampling method will be used to generate a sample from the posterior density function and compute Bayes estimates. For the purpose of generating a sample from the posterior distribution, it is assumed that the PDFs of prior densities are as described in equation (10). The full conditional posterior densities of and and the data are given by

The full conditional distributions above cannot be reduced to well-known distributions, and hence we cannot generate $\alpha$ and $\beta$ from them directly using standard methods. We solve this problem by using the Metropolis-Hastings (M-H) algorithm (for further details about this algorithm, one may refer to Metropolis et al. [27] and Hastings [28]). The main concern is to keep the number of rejections as small as possible. The algorithm below uses the normal distribution as the proposal distribution; it is used to find the Bayes estimators and to construct the credible intervals for $\alpha$ and $\beta$. The algorithm is summarized as follows:

(1) Start with initial values $(\alpha^{(0)}, \beta^{(0)})$.
(2) Use the M-H algorithm on equation (14) to generate a posterior sample for the parameters $\alpha$ and $\beta$.
(3) Repeat step 2 $N$ times and obtain $(\alpha_1, \beta_1), (\alpha_2, \beta_2), \ldots, (\alpha_N, \beta_N)$.
(4) From the posterior sample, compute the Bayes estimates of $\alpha$ and $\beta$ with respect to the quadratic loss function:

$$\hat{\alpha}_{B} = \frac{1}{N - M}\sum_{i=M+1}^{N}\alpha_i, \qquad \hat{\beta}_{B} = \frac{1}{N - M}\sum_{i=M+1}^{N}\beta_i, \tag{15}$$

where $M$ is the Markov Chain's burn-in period.
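The steps above can be sketched as a Metropolis-within-Gibbs sampler with normal random-walk proposals. Several choices here are assumptions rather than the paper's settings: the NPD likelihood is taken in the form implied by Bourguignon et al. [3], the gamma priors use a shape-rate parametrization with hypothetical hyperparameters `(a1, b1, a2, b2)`, and the proposal step sizes are illustrative tuning values.

```python
import math
import random

def log_posterior(alpha, beta, x, scheme, a1, b1, a2, b2):
    """Log posterior up to an additive constant: PC likelihood times gamma priors."""
    if alpha <= 0.0 or beta <= 0.0 or beta > x[0]:
        return -math.inf                              # outside the parameter support
    m, n = len(x), len(x) + sum(scheme)
    lp = m * math.log(alpha) + n * alpha * math.log(beta)
    for xi, ri in zip(x, scheme):
        # ln(x^alpha + beta^alpha) = alpha*ln(x) + ln(1 + (beta/x)^alpha), with beta <= x
        log_sum = alpha * math.log(xi) + math.log1p((beta / xi) ** alpha)
        lp += (alpha - 1.0) * math.log(xi) - (ri + 2) * log_sum
    lp += (a1 - 1.0) * math.log(alpha) - b1 * alpha   # gamma(a1, rate b1) prior on alpha
    lp += (a2 - 1.0) * math.log(beta) - b2 * beta     # gamma(a2, rate b2) prior on beta
    return lp

def mh_within_gibbs(x, scheme, n_iter, burn_in, rng,
                    prior=(1.0, 1.0, 1.0, 1.0), step_alpha=0.3, step_beta=None):
    """Return post-burn-in draws (alpha_i, beta_i) from the joint posterior."""
    if step_beta is None:
        step_beta = 2.0 * x[0] / (len(x) + sum(scheme))   # hypothetical tuning choice
    alpha, beta = 1.0, x[0]                                # initial values
    cur = log_posterior(alpha, beta, x, scheme, *prior)
    draws = []
    for it in range(n_iter):
        # update alpha given beta
        cand = rng.gauss(alpha, step_alpha)
        new = log_posterior(cand, beta, x, scheme, *prior)
        if rng.random() < math.exp(min(0.0, new - cur)):
            alpha, cur = cand, new
        # update beta given alpha
        cand = rng.gauss(beta, step_beta)
        new = log_posterior(alpha, cand, x, scheme, *prior)
        if rng.random() < math.exp(min(0.0, new - cur)):
            beta, cur = cand, new
        if it >= burn_in:
            draws.append((alpha, beta))
    return draws
```

Under the quadratic loss, the Bayes estimates are then the sample means of the retained draws, for example `sum(a for a, _ in draws) / len(draws)` for the shape parameter. Proposals that fall outside the support receive a log posterior of minus infinity and are always rejected, which keeps the symmetric random-walk proposal valid.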

4. Prediction

In many fields of the life sciences, the problem of predicting unobserved, censored, or lost observations from an experiment has received great attention (one may refer to Kaminsky and Nelson [29]; Raqab et al. [20]; Raqab et al. [23]; Belaghi et al. [24]; and Bdair et al. [11]). Here we study two methods of prediction, namely, (i) the best unbiased predictor (BUP) and (ii) the Bayes predictor (BP).

4.1. Best Unbiased Predictor

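In this section, our goal is to predict the lifetimes $Y_{s:R_j}$ ($s = 1, \ldots, R_j$; $j = 1, \ldots, m$) of the units censored at the $j$th stage, based on the observations under PC, $\mathbf{x} = (x_{1:m:n}, \ldots, x_{m:m:n})$. By the Markovian property of progressive Type-II censored order statistics, $Y_{s:R_j}$ behaves like the $s$th order statistic from a sample of size $R_j$ drawn from the distribution truncated on the left at $x_j = x_{j:m:n}$, with PDF $f(y)/[1 - F(x_j)]$, where $y > x_j$, and hence we obtain

$$f(y \mid \mathbf{x}) = s\binom{R_j}{s} f(y)\,\frac{[F(y) - F(x_j)]^{s-1}\,[1 - F(y)]^{R_j - s}}{[1 - F(x_j)]^{R_j}}, \tag{16}$$

where $y > x_j$. Substituting the PDF and CDF of NPD into equation (16) and simplifying, we observe that

$$f(y \mid \mathbf{x}) = s\binom{R_j}{s}\,\alpha\, y^{\alpha - 1}\,\frac{\left(y^{\alpha} - x_j^{\alpha}\right)^{s-1}\left(x_j^{\alpha} + \beta^{\alpha}\right)^{R_j - s + 1}}{\left(y^{\alpha} + \beta^{\alpha}\right)^{R_j + 1}}, \quad y > x_j. \tag{17}$$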

Since , the term in equation (17) can be represented as a series expansion using well-known binomial theorem, so the conditional density is rewritten as

The best unbiased predictor (BUP) of is the expected value , that is,where . Using integration techniques and binomial expansion in the integral part, equation (19) reduces towhere , and . If we assume that the parameters and are unknown, the BUP of will bewhere and are the MLEs of and , respectively, and .
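As a rough numerical check on a predictor of this kind, the conditional expectation can be approximated by Monte Carlo instead of the closed-form series. The sketch below is not the paper's formula: it assumes the NPD survival function $2\beta^{\alpha}/(y^{\alpha} + \beta^{\alpha})$ of Bourguignon et al. [3], so that a unit still alive at $x_j$ survives past $y$ with probability $(x_j^{\alpha} + \beta^{\alpha})/(y^{\alpha} + \beta^{\alpha})$, and it treats the $R_j$ units censored at stage $j$ as a sample from that left-truncated distribution. Function names are hypothetical.

```python
import random

def sample_censored_order_stat(s, r_j, x_j, alpha, beta, rng):
    """Draw the s-th smallest of r_j lifetimes from the NPD left-truncated at x_j."""
    b = beta ** alpha
    c = x_j ** alpha + b
    ys = []
    for _ in range(r_j):
        u = 1.0 - rng.random()                    # u in (0, 1]: conditional survival probability
        # solve (x_j^alpha + beta^alpha) / (y^alpha + beta^alpha) = u for y
        ys.append((c / u - b) ** (1.0 / alpha))
    ys.sort()
    return ys[s - 1]

def bup_monte_carlo(s, r_j, x_j, alpha, beta, rng, n_draws=20000):
    """Monte Carlo estimate of E(Y_{s:R_j} | x_j) for given alpha and beta."""
    total = 0.0
    for _ in range(n_draws):
        total += sample_censored_order_stat(s, r_j, x_j, alpha, beta, rng)
    return total / n_draws
```

Plugging in the MLEs for `alpha` and `beta` gives a simulation counterpart of the plug-in predictor. Because the NPD has a Pareto-type tail, the average is stable only when $\alpha(R_j - s + 1)$ is comfortably larger than 2, so that the variance of the order statistic is finite.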

4.2. Bayesian Prediction

Bayes prediction (BP) of the censored observation from the future sample depends on the actual observed sample which is known as informative sample. For that reason, we consider the estimation of posterior predictive density (PPD) of the order . The posterior predictive density of given the observed censored data is given bywhere is the conditional density of given , and data , which is given in equation (18), and is the joint posterior given in equation (13). Now the Bayes predictor (BP) of under squared error loss function (SEL) can be obtained as

The form of the PPD in equation (23) is not easy to compute; therefore, Bayesian predictive estimates are difficult to find explicitly. Thus, there is a need to use the MCMC sample technique which was described in Section 3. The MCMC technique is conducted to generate samples from the PPD. These samples are of the form and are obtained using the M-H methods and Gibbs sampling. The sample-based predictor of is given byand hence after integration techniques and algebraic simplifications, the sample-based predictor can be written aswhere and .

From the above PPD, one can obtain a two-sided predictive interval for . For that purpose, we need to find the predictive survival function of at point , which can be defined as

Under the SEL function, the predictive survival function of is given by

The predictive survival function in equation (28) cannot be easily evaluated analytically, and hence numerical approximation technique will be preferable in this case. The MCMC samples can be used to approximately evaluate equation (28), so let , and then the simulated estimator for the predictive survival function can be written aswhere .

Now, the predictive interval of is found by solving the following non-linear equations using a suitable numerical technique which is given in the following equation:where (L) denotes the lower bound and (U) denotes the upper bound.
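A minimal sketch of this last step is given below. It assumes an equal-tailed interval, so the lower bound solves $S(L) = 1 - \gamma/2$ and the upper bound solves $S(U) = \gamma/2$, and it assumes the NPD survival function of Bourguignon et al. [3]. The conditional survival function of the $s$th censored lifetime is computed from the exact binomial identity for order statistics and then averaged over posterior draws $(\alpha_i, \beta_i)$; bisection stands in for whichever numerical technique is preferred. All names are illustrative.

```python
import math

def order_stat_survival(y, s, r_j, x_j, alpha, beta):
    """P(Y_{s:R_j} > y | x_j, alpha, beta): fewer than s of the r_j censored units fail by y."""
    if y <= x_j:
        return 1.0
    q = (x_j ** alpha + beta ** alpha) / (y ** alpha + beta ** alpha)  # one unit survives past y
    p = 1.0 - q
    return sum(math.comb(r_j, k) * p ** k * q ** (r_j - k) for k in range(s))

def predictive_survival(y, s, r_j, x_j, draws):
    """Average the conditional survival function over posterior draws (alpha_i, beta_i)."""
    return sum(order_stat_survival(y, s, r_j, x_j, a, b) for a, b in draws) / len(draws)

def solve_survival(target, s, r_j, x_j, draws, tol=1e-10):
    """Find y > x_j with predictive survival equal to target (survival decreases in y)."""
    lo, hi = x_j, 2.0 * x_j
    while predictive_survival(hi, s, r_j, x_j, draws) > target:
        hi *= 2.0                                   # expand until the root is bracketed
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if predictive_survival(mid, s, r_j, x_j, draws) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def predictive_interval(s, r_j, x_j, draws, gamma=0.05):
    """Equal-tailed 100(1 - gamma)% predictive interval (L, U)."""
    lower = solve_survival(1.0 - gamma / 2.0, s, r_j, x_j, draws)
    upper = solve_survival(gamma / 2.0, s, r_j, x_j, draws)
    return lower, upper
```

For example, `predictive_interval(1, 3, x_j, draws)` would give a 95% interval for the first failure among three units withdrawn at $x_j$, where `draws` is a list of posterior pairs such as the output of an MCMC run.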

5. Simulation Analysis

In this section, we perform a simulation analysis to check the performance of the Bayes estimators compared with the classical estimators obtained by the MLE based on PC with NPD lifetimes. Also, we compute the best unbiased predictor and Bayes predictor for the missing data with respect to the observed PC. In Bayes estimation, we use the square error loss function SEL. We compute the mean square errors (MSEs) and the biases for Bayes and MLE estimators based on 10000 replications using R package. In estimation and prediction, we suggest fixed values of the parameters to be and sample sizes to be , in order to generate progressive Type-II censored data. Also, under PC, we obtain the point predictors and the 95% prediction intervals for the missing order statistics ; , .

The MLE and Bayes estimators for the NPD parameters and their corresponding CI lengths, in addition to the results of the prediction problem, are all reported in Tables 1–6. The following censoring schemes are suggested:(1)Scheme 1: (2)Scheme 2: (3)Scheme 3: .

In Tables 1 and 2, we show the MLEs and the Bayes estimates of and under different censoring schemes. Numerical results of estimators and their corresponding biases and MSEs are computed using the algorithm presented in Sections 2 and 3. In Tables 3 and 4, we present numerical comparisons between the average lengths (AL) and the coverage percentages (CP) of the credible intervals and asymptotic intervals under NPD parameters. In Tables 5 and 6, we present MLE and Bayes point predicted values and the prediction intervals for the missing order statistics , based on the observed sample of size with censoring scheme , for all schemes described above under the loss function SEL. The MCMC samples , the point BUP, and BP for the missing order statistics in censoring stage , , are computed. The lower bound and upper bound of prediction interval for the missing order statistics are also computed.

From Tables 1 and 2, we observe many attractive results that are summarized as follows:(i)The best point estimation method for estimating the shape parameter is Bayesian method under SEL, and this result is observed since it has minimum biases and minimum MSEs.(ii)For the scale parameter , the MLE proves to have the minimum biases and MSEs, and hence it is preferable to be used for point estimation of .(iii)When comparing the efficiency of censoring schemes with respect to biases and MSEs, it appears that scheme 3 performs well when estimating , while scheme 1 is better than others for estimating .

For interval estimation of NPD parameters, we use asymptotic CI from the MLE method and the credible CI from the Bayesian method under SEL. A simulation analysis with some numerical methods and MCMC technique show some results that appear in Tables 3 and 4. Comparisons between the two CIs are conducted depending on the average interval lengths and the coverage percentages (CP) as well, and hence the main results are summarized below:(i)The asymptotic CI has less average interval length for estimating than credible CI based on CI average lengths, under the three suggested censoring schemes and sample sizes 50 and 100.(ii)The credible CI has higher CP for estimating than the asymptotic CI, under the three suggested censoring schemes and sample sizes 50 and 100.(iii)The credible CI has less average interval length for estimating than asymptotic CI, under the three suggested censoring schemes and sample sizes 50 and 100.(iv)The credible CI has higher CP for estimating under censoring scheme 1, while the asymptotic CI has higher CP than the credible CI under censoring schemes 2 and 3 (see Table 3).

The two prediction methods used in this paper are BUP and BP, and we conduct a simulation analysis to compare them. The numerical results for the predicted unobserved order statistics are reported in Tables 5 and 6, which give point and interval predictions under different censoring schemes and sample sizes 50 and 100.

From these tables, we notice that the predicted values of belong to the proposed confidence interval and the predicted values under BP are less than their values under BUP for censoring scheme 1, while the converse is true for censoring scheme 2. Under censoring scheme 3, no fixed rule is obtained for the prediction value comparison.

For a fixed sample size, we observe that the largest predicted value is observed when applying censoring scheme 3. One may also notice that the lower and the upper bounds of prediction interval for the missing order statistics increase as increase for each .

In order to select a suitable prediction method, one can rely on either the average interval length or the coverage percentage (CP) of the observed intervals. These can be calculated easily from Tables 5 and 6; for example, in the case of censoring scheme 1 with sample size 50, we prefer to use BP to predict the unobserved statistics , since it has the shorter interval length, but based on the CP, BUP is preferable. To predict , we prefer to use the BUP, as it has both the shorter interval length and the higher CP (more detailed results are found in Tables 5 and 6).

6. Conclusions

In this article, we estimated the unknown parameters of the NPD under progressive Type-II censored sampling in order to assess the performance of the MLE and Bayesian estimation methods and to determine the best method for predicting unobserved lifetimes. The MLE and the Bayes estimation methods were used to obtain both point and interval estimates. Two methods of prediction for future observations were employed, namely, the BUP and the BP. Numerical methods and simulation analysis were used to compare the methods of estimation and the methods of prediction. We concluded that MLE is better for estimating the scale parameter $\beta$, while Bayes estimation is better for estimating the shape parameter $\alpha$. Further results are summarized from the tables in Section 5. Researchers may develop new distributions and apply other censoring schemes, such as adaptive and hybrid progressive censoring, to their sample data to obtain better point and interval estimation and better prediction of future unobserved data.

Data Availability

The numerical data used to support the findings of this study are included within the article.

Conflicts of Interest

The author declares that there are no conflicts of interest.

Acknowledgments

The author extends her appreciation to the Deanship of Scientific Research at King Faisal University for its financial support under Nasher Track (grant no. 206196).