Abstract

Heavy-tailed distributions play a prominent role in actuarial and financial sciences. In this paper, we introduce a family of distributions that we refer to as exponential T-X (ETX) family. Based on the proposed approach, a new extension of the Weibull model is introduced. The proposed model is very flexible in modeling heavy-tailed data. Some mathematical properties are derived, and maximum likelihood estimates of the model parameters are obtained. A Monte Carlo simulation study is conducted to evaluate the performance of the maximum likelihood estimators. Actuarial measures such as value at risk and tail value at risk are also calculated. A simulation study based on these actuarial measures is provided. Finally, an application to a heavy-tailed automobile insurance claim data set is presented. The proposed model is compared with some well-known competing distributions.

1. Introduction

Broadly speaking, statistical distributions play a prominent role in modeling data in applied fields, particularly in risk management, economics, finance, and actuarial science. However, the quality of the resulting procedures depends primarily on the probability model assumed for the phenomenon under consideration. Insurance losses are usually positive, right-skewed, unimodal, and heavy-tailed; see the works of Lane [1], Cooray and Ananda [2], Klugman et al. [3], and Ahmad et al. [4]. Actuaries therefore often look for heavy-tailed distributions that provide an adequate estimate of the associated business risk level. Heavy-tailed distributions are those whose right tail probabilities are heavier than those of the exponential distribution; that is, a distribution function $F$ with survival function $\bar F(x) = 1 - F(x)$ is heavy-tailed if it satisfies
$$\lim_{x \to \infty} e^{\lambda x}\, \bar F(x) = \infty \quad \text{for all } \lambda > 0. \tag{1}$$

An important characteristic of heavy-tailed distributions is the regular variation property. A distribution is called regularly varying if its survival function $\bar F = 1 - F$ obeys
$$\lim_{t \to \infty} \frac{\bar F(tx)}{\bar F(t)} = x^{-\beta}, \quad x > 0, \tag{2}$$
where $-\beta$, with $\beta \ge 0$, is the so-called index of regular variation. Distributions that possess the regular variation property are very competitive models for heavy-tailed data sets. For more details, interested readers can refer to the works of McNeil [5] and Beirlant et al. [6].

Well-known probability models such as the log-normal, Pareto, gamma, beta, and Weibull distributions are very useful for modeling data in applications. However, these classical distributions have certain deficiencies in modeling insurance losses. For example, (i) the log-normal, gamma, and beta distributions do not have closed-form expressions for the cumulative distribution function (cdf), so the computation of many mathematical properties becomes difficult, (ii) the Pareto distribution, because of the monotonically decreasing shape of its density, does not provide a reasonably good fit in many applications, and (iii) the Weibull distribution captures the behavior of small losses well but fails to capture the behavior of large losses [7].

Therefore, practitioners have shown a deep interest in proposing extended versions of these existing distributions. The new developments have been constructed through several approaches such as (i) composition of two or more distributions, (ii) compounding of distributions, and (iii) finite mixtures of distributions (Scollnik and Sun [8] and Ahmad et al. [9]).

The composition of two or more distributions is a prominent approach for obtaining new flexible heavy-tailed families of distributions, which give a reasonably good fit to heavy-tailed losses, as shown by Cooray and Ananda [2], Nadarajah and Abu Bakar [10], and Abu Bakar et al. [11]. However, the distributions obtained by the composition approach often involve more than three parameters, which complicates the estimation process and increases the computational effort.

Another prominent approach is the compounding of distributions, which accommodates unimodal, right-skewed, and heavy-tailed data, as illustrated by Punzo [12], Mazza and Punzo [13], Tomarchio and Punzo [14], and Punzo and Bagnato [15]. However, the density obtained via this method may not have a closed-form expression, which makes the estimation more cumbersome, as shown in Punzo et al. [16]. For a brief review of the compounding of distributions, we refer to the work of Tahir and Cordeiro [17].

Finite mixture models represent a further approach to define very flexible distributions which are also able to capture, for instance, multimodality of the underlying distribution as shown in the works of Bernardi et al. [18], Miljkovic and Grun [19], and Punzo et al. [20]. The price to pay for this greater flexibility is a more complicated and computationally challenging inference.

In this article, we introduce a family of distributions called the exponential T-X (ETX) family of distributions. The proposed family possesses regularly varying tail behavior and can therefore be used quite effectively for modeling heavy-tailed data. Using the ETX method, we study a special model named the exponential T-X Weibull (ETX-Weibull) distribution. We later show empirically, using automobile insurance claim data, that the ETX-Weibull distribution provides better fits than well-known competing distributions in terms of several model selection measures.

We hope that the ETX-Weibull distribution will attract wider applications in insurance losses data and financial returns, among others. The estimation of the ETX-Weibull distribution parameters using the method of maximum likelihood estimation has been carried out. Further, some actuarial measures such as value at risk (VaR) and tail value at risk (TVaR) are also calculated.

The rest of this article is organized in six sections as follows: the methodology and ETX-Weibull model are discussed in Section 2. The maximum likelihood estimators of the ETX-Weibull parameters are obtained in Section 3. A Monte Carlo simulation study is provided in Section 4. The actuarial measures of the ETX-Weibull distribution and their simulation study are derived in Section 5. A practical application to the automobile insurance claims data set is provided in Section 6. Finally, the article is concluded in the last section.

2. Methodology, Special Model, and Properties

This section presents the methodology used to introduce the proposed family and a special model of that family. Furthermore, the regular variation results and other mathematical properties are derived.

2.1. Methodology

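In this subsection, we introduce a new family of distributions using the T-X family approach [21]. Let $r(t)$ be the probability density function (pdf) of a random variable, say $T$, where $T \in [a, b]$ for $-\infty \le a < b \le \infty$, and let $W[F(x;\xi)]$ be a function of the cumulative distribution function (cdf) $F(x;\xi)$ of a random variable, say $X$, depending on the parameter vector $\xi$ and satisfying the conditions given below:
(1) $W[F(x;\xi)] \in [a, b]$
(2) $W[F(x;\xi)]$ is differentiable and monotonically increasing
(3) $W[F(x;\xi)] \to a$ as $x \to -\infty$ and $W[F(x;\xi)] \to b$ as $x \to \infty$

The cdf of the T-X family is defined by
$$G(x) = \int_a^{W[F(x;\xi)]} r(t)\,dt, \quad x \in \mathbb{R}, \tag{3}$$
where $W[F(x;\xi)]$ satisfies the conditions stated above. The pdf corresponding to (3) is
$$g(x) = \left\{\frac{\partial}{\partial x} W[F(x;\xi)]\right\} r\{W[F(x;\xi)]\}, \quad x \in \mathbb{R}. \tag{4}$$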

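For related work on the T-X method, we refer to the work of Ahmad et al. The ETX family is obtained by taking the pdf in (3) to be that of the exponential distribution and replacing the upper limit of (3) with a suitable function of the baseline distribution. If a random variable X follows one of the members of the ETX family, then its cdf is given by
$$G(x;\sigma,\xi) = 1 - \frac{\sigma\,\bar F(x;\xi)}{\sigma - 1 + \bar F(x;\xi)}, \quad \sigma > 1,\ x \in \mathbb{R}, \tag{5}$$
where $\sigma$ is the additional parameter and $\bar F(x;\xi) = 1 - F(x;\xi)$ is the survival function (sf) of the baseline random variable depending on the parameter vector $\xi$.

To the best of our knowledge, the proposed method has not been used so far, which is a further motivation for the proposed approach. Using this method, a number of new distributions can be obtained. The probability density function (pdf) corresponding to (5) is given by
$$g(x;\sigma,\xi) = \frac{\sigma(\sigma - 1)\, f(x;\xi)}{[\sigma - 1 + \bar F(x;\xi)]^{2}}, \quad x \in \mathbb{R}, \tag{6}$$
where $f(x;\xi)$ is the baseline pdf.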

The new pdf is most tractable when the baseline cdf and pdf have simple analytical expressions. The basic motivations for using the ETX family of distributions in practice are the following:
(1) To provide a new method of introducing a single additional parameter to generate generalized versions of the baseline model, rather than adding two or more parameters
(2) To improve the characteristics and flexibility of the classical distributions
(3) To make the kurtosis more flexible as compared to the baseline model
(4) To obtain new models suitable for modeling heavy-tailed data
(5) To define special models having closed forms for the cdf and sf as well as the failure rate function
(6) To provide consistently better fits than other generated distributions having the same or a higher number of parameters

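The survival function (sf) and hazard rate function (hrf) corresponding to (5) are given, respectively, by
$$S(x;\sigma,\xi) = \frac{\sigma\,\bar F(x;\xi)}{\sigma - 1 + \bar F(x;\xi)} \tag{7}$$
and
$$h(x;\sigma,\xi) = \frac{(\sigma - 1)\, f(x;\xi)}{\bar F(x;\xi)\,[\sigma - 1 + \bar F(x;\xi)]}, \tag{8}$$
where $f(x;\xi)$ and $\bar F(x;\xi)$ are the baseline pdf and sf and $\sigma > 1$ is the additional parameter.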

Based on (5), we propose a three-parameter special submodel, called ETX-Weibull distribution. We derive explicit expressions for the VaR and TVaR of the proposed distribution. Most importantly, we provide a comprehensive simulation study of the VaR and TVaR and empirically show that the ETX-Weibull distribution is a heavy-tailed model and can be used quite effectively in the field of insurance sciences and other related areas.

2.2. Special Model

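Consider the distribution and density functions of the two-parameter Weibull distribution given by $F(x;\xi) = 1 - e^{-\gamma x^{\alpha}}$ and $f(x;\xi) = \alpha\gamma x^{\alpha-1} e^{-\gamma x^{\alpha}}$, respectively, where $x \ge 0$, $\alpha, \gamma > 0$, and $\xi = (\alpha, \gamma)$. Then, the cdf of the ETX-Weibull distribution is given by
$$G(x;\alpha,\sigma,\gamma) = 1 - \frac{\sigma e^{-\gamma x^{\alpha}}}{\sigma - 1 + e^{-\gamma x^{\alpha}}}, \quad x \ge 0,\ \alpha, \gamma > 0,\ \sigma > 1, \tag{9}$$
with pdf
$$g(x;\alpha,\sigma,\gamma) = \frac{\alpha\gamma\sigma(\sigma - 1)\, x^{\alpha-1} e^{-\gamma x^{\alpha}}}{[\sigma - 1 + e^{-\gamma x^{\alpha}}]^{2}}, \quad x > 0. \tag{10}$$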
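As a quick check of the ETX-Weibull cdf and pdf, the following minimal R sketch codes both functions in the form used in the Appendix and verifies numerically that the pdf integrates to one and agrees with the cdf. The function names `petxw` and `detxw` and the parameter values are illustrative assumptions, not part of the original study.

```r
# ETX-Weibull cdf and pdf, following the expressions used in the Appendix code.
# a = alpha > 0, s = sigma > 1, g = gamma > 0; function names are illustrative.
petxw <- function(x, a, s, g) {
  sf0 <- exp(-g * x^a)                 # baseline Weibull survival function
  1 - s * sf0 / (s - 1 + sf0)
}

detxw <- function(x, a, s, g) {
  sf0 <- exp(-g * x^a)
  a * g * s * (s - 1) * x^(a - 1) * sf0 / (s - 1 + sf0)^2
}

# Arbitrary illustrative parameter values (not taken from the paper)
a <- 1.5; s <- 1.2; g <- 0.5

# The pdf should integrate to one, and its partial integral should match the cdf
total_mass <- integrate(function(x) detxw(x, a, s, g), 0, Inf)$value
area_to_2  <- integrate(function(x) detxw(x, a, s, g), 0, 2)$value
cdf_at_2   <- petxw(2, a, s, g)
print(c(total_mass = total_mass, area_to_2 = area_to_2, cdf_at_2 = cdf_at_2))
```

If the two functions are consistent, `total_mass` is close to 1 and `area_to_2` is close to `cdf_at_2`.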

The effect of the additional parameter on the shape of the pdf of the proposed model is shown in Figure 1. These density plots are presented for fixed values of two of the parameters and different values of the third. From Figure 1, it is clear that the proposed model tends to a heavy-tailed distribution as the value of the varying parameter increases. These plots illustrate that the additional parameter has a significant effect on the behavior of the pdf of the proposed model.

2.3. Mathematical Properties

In this section, some mathematical properties of the proposed family such as regularly varying tail behavior, quantile function, moments, and moment generating function are derived.

2.3.1. Regularly Varying Tail Behavior

The regularly varying tail behavior is an important characteristic to identify heavy-tailed distributions. In this subsection, we deal with the regular variational behavior of the proposed family. According to [22], in terms of sf, we have the following characterization.

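Theorem 1. If the baseline sf $\bar F(x;\xi)$ is regularly varying, then so is the sf $S(x;\sigma,\xi)$ of the ETX family.

Proof. Assume that $\lim_{t\to\infty} \bar F(tx;\xi)/\bar F(t;\xi)$ is finite but nonzero for every $x > 0$. Using expression (7), that is, $S(x;\sigma,\xi) = \sigma\bar F(x;\xi)/[\sigma - 1 + \bar F(x;\xi)]$ with $\sigma > 1$, we observe that
$$\lim_{t\to\infty} \frac{S(tx;\sigma,\xi)}{S(t;\sigma,\xi)} = \lim_{t\to\infty} \frac{\bar F(tx;\xi)}{\bar F(t;\xi)} \cdot \frac{\sigma - 1 + \bar F(t;\xi)}{\sigma - 1 + \bar F(tx;\xi)}. \tag{11}$$
Since $\lim_{t\to\infty} \bar F(t;\xi) = 0$, the expression in (11) reduces to
$$\lim_{t\to\infty} \frac{S(tx;\sigma,\xi)}{S(t;\sigma,\xi)} = \lim_{t\to\infty} \frac{\bar F(tx;\xi)}{\bar F(t;\xi)}, \tag{12}$$
which is finite but nonzero for every $x > 0$; thus, $S(x;\sigma,\xi)$ is regularly varying.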

Remark 1. By Karamata’s characterization theorem [23], the function has the form , where is the so-called index of regular variation, and .

2.3.2. Regular Variational Result

If the distribution of has power law behavior, then, according to [22], we have

Now, by Karamata’s characterization theorem [23], this means that we should be able to write aswhere is slowly varying. Note that

Since , we can writewhere . So if is slowly varying, then the variational result obtained is true. Now, according to Resnick [24], we must show that, for all ,

After some simplification, we get

If , then , and . Therefore, from expression (18), we havewhich leads to the fact that

2.3.3. The Quantile Function

The quantile function of a distribution is very useful for generating random numbers in Monte Carlo simulation. The quantile function of a random variable X with cdf (5) is given by
$$x_u = Q(u) = F^{-1}(t), \tag{21}$$
where $F^{-1}$ is the quantile function of the baseline distribution, t is the solution of the equation $t(\sigma - 1) - u(\sigma - t) = 0$, and $u \in (0, 1)$. The nonlinear expression (21) can be used to generate random numbers from any submodel of the proposed class.
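The following R sketch illustrates the quantile technique for the ETX-Weibull case. It assumes the cdf form used in the Appendix code; under that assumption the equation for t is linear and can be solved as t = σu/(σ − 1 + u), so no root finder is needed. This closed form is derived here for illustration and is not stated in the original text, where the Appendix uses a numerical root search instead. The function names are hypothetical.

```r
# ETX-Weibull cdf in the form used in the Appendix (a = alpha, s = sigma, g = gamma)
petxw <- function(x, a, s, g) {
  sf0 <- exp(-g * x^a)
  1 - s * sf0 / (s - 1 + sf0)
}

# Quantile function: solve t*(s - 1) - u*(s - t) = 0 for t, then invert the Weibull cdf
qetxw <- function(u, a, s, g) {
  t <- s * u / (s - 1 + u)          # baseline cdf value that corresponds to level u
  (-log(1 - t) / g)^(1 / a)         # Weibull quantile at t
}

# Inverse-transform sampler built on the quantile function
retxw <- function(n, a, s, g) qetxw(runif(n), a, s, g)

# Illustrative parameter values (the same ones used in the Appendix simulation)
a <- 0.9; s <- 1.2; g <- 0.5

# Round trip: G(Q(u)) should return u
u <- c(0.1, 0.5, 0.9)
round_trip_error <- max(abs(petxw(qetxw(u, a, s, g), a, s, g) - u))

# About half of a simulated sample should fall below the theoretical median
set.seed(1)
x <- retxw(5000, a, s, g)
share_below_median <- mean(x <= qetxw(0.5, a, s, g))
print(round_trip_error)
print(share_below_median)
```

A vectorized closed form of this kind avoids calling a root finder once per observation, which matters when many replications are simulated.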

2.3.4. Moments

Suppose that X is a random variable with pdf (6); then the moment of the proposed family is derived as

Using the series, we have

Replacing x by in (23) and then using it in (22), we arrive atwhere .

For r= 1, 2, 3, 4, we get the first four moments of the ETX family. The effects of the shape parameters on the skewness and kurtosis can be detected on the moments. The moment of the ETX-Weibull distribution can be calculated using (24) as follows. Using the pdf and cdf of the two-parameter Weibull distribution (defined in Section 2.2), can be expressed aswhere represents the exponentiated Weibull (Exp-Weibull) density with the baseline Weibull (defined in Section 2.2) with power parameter . Hence, reduces towhere . Using equations (24) and (26), the r moment of the ETX-Weibull distribution reduces to

The central moment of the ETX-Weibull distribution can be expressed aswhere follows from (27) with . Hence, of the ETX-Weibull distribution takes the form

Hence, the second, third, and fourth central moments, $\mu_2$, $\mu_3$, and $\mu_4$, of the ETX-Weibull distribution follow simply from (29) by replacing $r$ with 2, 3, and 4, respectively. Based on these moments, we can obtain skewness and kurtosis measures of the ETX-Weibull distribution. The skewness and kurtosis of the ETX-Weibull distribution are defined by
$$\text{Skewness} = \frac{\mu_3}{\mu_2^{3/2}}, \qquad \text{Kurtosis} = \frac{\mu_4}{\mu_2^{2}}.$$

These measures are less sensitive to outliers. Plots for the skewness and kurtosis of the ETX-Weibull distribution are displayed in Figure 2.
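The moment-based skewness and kurtosis can also be evaluated numerically, without the series expansions. The R sketch below does this by numerical integration, assuming the ETX-Weibull pdf in the form used in the Appendix code. The function names and parameter values are illustrative assumptions.

```r
# ETX-Weibull pdf as coded in the Appendix (a = alpha, s = sigma > 1, g = gamma)
detxw <- function(x, a, s, g) {
  sf0 <- exp(-g * x^a)
  a * g * s * (s - 1) * x^(a - 1) * sf0 / (s - 1 + sf0)^2
}

# r-th raw moment by numerical integration
raw_moment <- function(r, a, s, g) {
  integrate(function(x) x^r * detxw(x, a, s, g), 0, Inf)$value
}

# Illustrative parameter values (not taken from the paper)
a <- 1.5; s <- 1.2; g <- 0.5
m <- vapply(1:4, function(r) raw_moment(r, a, s, g), numeric(1))

# Central moments expressed through the raw moments
mu2 <- m[2] - m[1]^2
mu3 <- m[3] - 3 * m[1] * m[2] + 2 * m[1]^3
mu4 <- m[4] - 4 * m[1] * m[3] + 6 * m[1]^2 * m[2] - 3 * m[1]^4

skewness <- mu3 / mu2^1.5
kurtosis <- mu4 / mu2^2
print(c(mean = m[1], variance = mu2, skewness = skewness, kurtosis = kurtosis))
```

For very large σ the pdf approaches the Weibull pdf, so the computed mean should approach the Weibull mean γ^(−1/α) Γ(1 + 1/α), which gives a simple way to validate the routine.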

Furthermore, the moment generating function, say , of the ETX family of distributions can be obtained as follows:

3. Maximum Likelihood Estimation

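In this section, we use the maximum likelihood method to estimate the model parameters. Let $x_1, x_2, \ldots, x_n$ be the observed values of a random sample of size n taken from pdf (6) with parameters $\sigma$ and $\xi$. Then, the log-likelihood function corresponding to (6) is given by
$$\ell(\sigma,\xi) = n\log\sigma + n\log(\sigma - 1) + \sum_{i=1}^{n} \log f(x_i;\xi) - 2\sum_{i=1}^{n} \log[\sigma - 1 + \bar F(x_i;\xi)],$$
where $f(x;\xi)$ and $\bar F(x;\xi)$ are the baseline pdf and sf.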

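The log-likelihood function can be maximized either directly or by solving the nonlinear likelihood equations obtained by differentiating it. We used the goodness of fit function in R with the "L-BFGS-B" algorithm to obtain the MLEs. With $\ell$ denoting the log-likelihood function, $\sigma$ the additional parameter, and $f(x;\xi)$ and $\bar F(x;\xi)$ the baseline pdf and sf, the partial derivatives with respect to the parameters are given, respectively, by
$$\frac{\partial \ell}{\partial \sigma} = \frac{n}{\sigma} + \frac{n}{\sigma - 1} - 2\sum_{i=1}^{n} \frac{1}{\sigma - 1 + \bar F(x_i;\xi)}$$
and
$$\frac{\partial \ell}{\partial \xi} = \sum_{i=1}^{n} \frac{\partial f(x_i;\xi)/\partial \xi}{f(x_i;\xi)} - 2\sum_{i=1}^{n} \frac{\partial \bar F(x_i;\xi)/\partial \xi}{\sigma - 1 + \bar F(x_i;\xi)}.$$

Setting these expressions equal to zero and solving them numerically and simultaneously yields the maximum likelihood estimators (MLEs) of the model parameters.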
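A minimal R sketch of the direct maximization is given below for the ETX-Weibull case. It assumes the log-density used in the Appendix, simulates a sample with a closed-form quantile derived from the same cdf, and calls `optim()` with `method = "L-BFGS-B"`. The starting values, sample size, bounds, and function names are illustrative assumptions rather than the settings of the original study.

```r
# Closed-form ETX-Weibull quantile (derived from the Appendix cdf; used only to simulate data)
qetxw <- function(u, a, s, g) {
  t <- s * u / (s - 1 + u)
  (-log(1 - t) / g)^(1 / a)
}

# Log-likelihood of the ETX-Weibull, written on the log scale to avoid underflow
loglik_etxw <- function(par, x) {
  a <- par[1]; s <- par[2]; g <- par[3]
  sum(log(a) + log(g) + log(s) + log(s - 1) + (a - 1) * log(x) -
        g * x^a - 2 * log(s - 1 + exp(-g * x^a)))
}

# Simulated data with illustrative "true" values alpha = 0.9, sigma = 1.2, gamma = 0.5
set.seed(123)
x <- qetxw(runif(500), a = 0.9, s = 1.2, g = 0.5)

# Box constraints keep alpha, gamma > 0 and sigma > 1; fnscale = -1 turns optim into a maximizer
start <- c(1, 1.5, 1)
lower <- c(0.001, 1.001, 0.001)
upper <- c(5, 5, 5)
fit <- optim(start, loglik_etxw, x = x, method = "L-BFGS-B",
             lower = lower, upper = upper, control = list(fnscale = -1))

print(fit$par)          # estimates of (alpha, sigma, gamma)
print(fit$value)        # maximized log-likelihood
print(fit$convergence)  # 0 indicates successful convergence
```

Because the likelihood surface of such three-parameter extensions can be flat in some directions, it is prudent to repeat the fit from several starting values and keep the solution with the largest log-likelihood.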

4. Monte Carlo Simulation Study

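In this section, we assess the behavior of the maximum likelihood estimators for finite samples of size n. A simulation study based on the ETX-Weibull distribution is carried out. Random numbers are generated from the ETX-Weibull distribution via the quantile (inverse cdf) technique, and the MLEs are obtained using the R function optim() with the argument method = "L-BFGS-B"; see the Appendix. The simulation study is based on the following steps:
(1) Generate N = 1000 samples of sizes n = 25, 50, ..., 1000 from the ETX-Weibull distribution
(2) Compute the maximum likelihood estimates of the model parameters
(3) Compute the MSEs and biases given by
$$\text{MSE}(\hat\theta) = \frac{1}{N}\sum_{i=1}^{N} (\hat\theta_i - \theta)^2 \quad \text{and} \quad \text{Bias}(\hat\theta) = \frac{1}{N}\sum_{i=1}^{N} (\hat\theta_i - \theta)$$
for $\theta = \alpha, \sigma, \gamma$, respectively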

The simulation results are graphically displayed in Figures 3-6.

From these plots, we observe the following results:
(i) The estimates tend to be stable as the sample size n increases
(ii) The estimated MSEs decay toward zero as n increases
(iii) The absolute biases decrease as n increases
(iv) The estimated biases decrease as n increases

The numerical results presented through plots reveal the consistency property of the MLEs.

5. Actuarial Measures

One of the most important tasks of actuarial institutions is to evaluate the exposure to market risk in a portfolio of instruments, which arises from changes in underlying variables such as equity prices, interest rates, or exchange rates. In this section, we calculate the VaR and TVaR for the proposed distribution.

5.1. Value at Risk

In the context of actuarial sciences, VaR is widely used by practitioners as a standard measure of financial market risk. It is also known as the quantile risk measure or quantile premium principle. The VaR is always specified with a given degree of confidence, say q (typically 90%, 95%, or 99%), and represents the percentage loss in the portfolio value that will be equaled or exceeded only 100(1 − q) percent of the time. The VaR of a random variable X is the qth quantile of its cdf; see the work of Artzner (1999). If X has cdf (5), then
$$\text{VaR}_q(X) = F^{-1}(t),$$
where $F^{-1}$ is the quantile function of the baseline distribution and t is the solution of the equation $t(\sigma - 1) - q(\sigma - t) = 0$.
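For the ETX-Weibull case, the VaR can be sketched in a few lines of R. The code assumes the cdf form of the Appendix, for which the equation in t has the explicit solution t = σq/(σ − 1 + q); the function names and the parameter values are illustrative assumptions. The Weibull VaR with the same α and γ is computed alongside for comparison.

```r
# VaR of the ETX-Weibull at level q (a = alpha, s = sigma > 1, g = gamma)
var_etxw <- function(q, a, s, g) {
  t <- s * q / (s - 1 + q)          # solves t*(s - 1) - q*(s - t) = 0
  (-log(1 - t) / g)^(1 / a)
}

# VaR of the two-parameter Weibull with cdf 1 - exp(-g * x^a)
var_weibull <- function(q, a, g) (-log(1 - q) / g)^(1 / a)

# Illustrative parameter values and confidence levels
a <- 0.9; s <- 1.2; g <- 0.5
q <- c(0.90, 0.95, 0.99)

var_table <- data.frame(level = q,
                        weibull = var_weibull(q, a, g),
                        etx_weibull = var_etxw(q, a, s, g))
print(var_table)
```

Since t exceeds q whenever σ is finite, the ETX-Weibull VaR is larger than the Weibull VaR at every level when α and γ are held equal, which is in line with the heavier tail reported in Section 5.3.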

5.2. Tail Value at Risk

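Another important measure is the TVaR, also known as conditional tail expectation (CTE) or tail conditional expectation (TCE), which quantifies the expected value of the loss given that an event outside a given probability level has occurred. Let X follow the proposed family with pdf $g(x)$; then the TVaR of X at level q is defined as
$$\text{TVaR}_q(X) = \frac{1}{1 - q}\int_{\text{VaR}_q}^{\infty} x\, g(x)\, dx.$$

Using the ETX-Weibull pdf (10) in this definition, we have
$$\text{TVaR}_q(X) = \frac{\alpha\gamma\sigma(\sigma - 1)}{1 - q}\int_{\text{VaR}_q}^{\infty} \frac{x^{\alpha} e^{-\gamma x^{\alpha}}}{[\sigma - 1 + e^{-\gamma x^{\alpha}}]^{2}}\, dx.$$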
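The TVaR integral can be evaluated numerically. The R sketch below does so for the ETX-Weibull distribution, assuming the pdf and cdf forms of the Appendix code and a closed-form VaR derived from them; the function names and parameter values are illustrative assumptions.

```r
# ETX-Weibull pdf and quantile (a = alpha, s = sigma > 1, g = gamma)
detxw <- function(x, a, s, g) {
  sf0 <- exp(-g * x^a)
  a * g * s * (s - 1) * x^(a - 1) * sf0 / (s - 1 + sf0)^2
}
qetxw <- function(u, a, s, g) {
  t <- s * u / (s - 1 + u)
  (-log(1 - t) / g)^(1 / a)
}

# TVaR at level q: mean of the loss beyond VaR_q, rescaled by the tail probability 1 - q
tvar_etxw <- function(q, a, s, g) {
  v <- qetxw(q, a, s, g)
  tail_mean <- integrate(function(x) x * detxw(x, a, s, g), v, Inf)$value
  tail_mean / (1 - q)
}

# Illustrative parameter values and confidence levels
a <- 0.9; s <- 1.2; g <- 0.5
q <- c(0.90, 0.95, 0.99)

var_q  <- qetxw(q, a, s, g)
tvar_q <- vapply(q, function(p) tvar_etxw(p, a, s, g), numeric(1))
print(data.frame(level = q, VaR = var_q, TVaR = tvar_q))
```

By construction the TVaR at a given level is never smaller than the VaR at that level, which provides a basic consistency check on the numerical integration.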

5.3. Simulation Study of Risk Measures

In this subsection, we provide a numerical study of the risk measures for the traditional two-parameter Weibull and ETX-Weibull models for different sets of parameters. The process is described as follows:
(i) Random samples of size n = 150 are generated from the Weibull and ETX-Weibull models, and the parameters are estimated via the maximum likelihood method
(ii) 1000 repetitions are made to calculate the VaR and TVaR for these distributions

A model with higher values of the risk measures (VaR and TVaR) is said to have heavier tails. The simulation results provided in Tables 1 and 2 show that the proposed model has higher values of the risk measures than the traditional Weibull distribution. In the light of the results provided in Tables 1 and 2 as well as in Figures 7 and 8, we can easily detect that ETX-Weibull distribution has heavier tails than the Weibull distribution. The simulation study of the VaR and TVaR is a prominent approach to determine the heavy-tailed distributions empirically.

In support of Table 1, the graphs for the VaR and TVaR of the Weibull and ETX-Weibull distributions are sketched in Figure 7.

In support of Table 2, the graphs for the VaR and TVaR of the Weibull and ETX-Weibull distributions are sketched in Figure 8.

6. An Application and Numerical Computation of VaR and TVaR

In this section, we illustrate the ETX-Weibull model by analyzing automobile insurance claim data to show how the proposed method works in practice. Furthermore, we calculate the actuarial measures of the Weibull and ETX-Weibull distributions using the real data set.

6.1. Automobile Insurance Claim Data

In this subsection, we illustrate the proposed model by analyzing a heavy-tailed real data set representing the automobile insurance claim data. The data set can be found at http://Auto_Insurance_Claims_Sample.csv. This data has also been studied by [25]. These data are used for comparison of the ETX-Weibull distribution with the other heavy-tailed distributions. For the comparison purposes, we consider some well-known (i) two-parameter models such as Lomax, Burr-XII (BX-II), gamma, and log-normal distributions, (ii) three-parameter models such as modified Weibull (MW) and Weibull-Claim (W-Claim) distributions, and (iii) a four-parameter distribution such as the new Weibull Burr-XII (NWBX-II) distribution. The distribution functions of the competitive distributions are as follows.(i)Lomax distribution:(ii)BX-II distribution:(iii)Log-normal distribution:(iv)Gamma distribution:(v)W-Claim distribution:(vi)MW distribution:(vii)NWBX-II distribution:

To decide about the goodness of fit among the proposed and other competitive distributions, we consider certain analytical measures. These measures include Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan-Quinn information criterion (HQIC), and consistent Akaike information criterion (CAIC). These measures are given as follows.(i)AIC is given by(ii)BIC is given by(iii)HQIC is given by(iv)CAIC is given bywhere denotes the log-likelihood function evaluated at the MLEs, k is the number of model parameters, and n is the sample size. The maximum likelihood estimates of the model parameters are provided in Table 3, whereas the analytical results are presented in Table 4.
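The four criteria can be computed from the maximized log-likelihood with a small helper such as the R sketch below. The AIC, BIC, and HQIC lines use the standard definitions. The CAIC line uses the form 2nk/(n − k − 1) − 2ℓ, which is an assumption about the definition intended here; another form in use is k(log n + 1) − 2ℓ. The function name and the input values are hypothetical and serve only as an example.

```r
# Information criteria from the maximized log-likelihood (loglik),
# the number of parameters (k), and the sample size (n).
# NOTE: the CAIC form below is an assumption; adapt it to the definition in use.
info_criteria <- function(loglik, k, n) {
  c(AIC  = 2 * k - 2 * loglik,
    BIC  = k * log(n) - 2 * loglik,
    HQIC = 2 * k * log(log(n)) - 2 * loglik,
    CAIC = 2 * n * k / (n - k - 1) - 2 * loglik)
}

# Hypothetical inputs: a three-parameter model fitted to 50 observations
crit <- info_criteria(loglik = -100, k = 3, n = 50)
print(crit)
```

For all four criteria, the model with the smallest value is preferred, so competing fits to the same data can be compared by applying the helper to each fit.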

From Table 4, it is clear that the proposed distribution has lower values of these measures than the other models considered in the comparison, which indicates a better fit. The fitted pdf and cdf of the proposed model for the analyzed data set are plotted in Figure 9, whereas the Kaplan-Meier survival plot of the proposed distribution and the box plot of the data set are sketched in Figure 10. From Figure 9, it is clear that the fitted pdf and cdf of the proposed model follow the empirical pdf and cdf very closely. From Figure 10, we can easily see that the data set has a heavy tail skewed to the right (box plot) and that the ETX-Weibull distribution fits the Kaplan-Meier survival plot very closely.

6.2. Computation of VaR and TVaR Using Automobile Insurance Claim Data

In this subsection, we compute the VaR and TVaR measures of the Weibull and the ETX-Weibull distributions using the estimated values of the parameters of the data set analyzed in subsection 6.1. The numerical results are reported in Table 5.

As we have mentioned earlier, the distribution with higher values of risk measures is said to possess heavier tails. From the numerical results for the VaR and TVaR of the Weibull and ETX-Weibull distributions provided in Table 5, it is clear that the proposed distribution has a heavier tail than the Weibull distribution and can be used as a good candidate model for modeling heavy-tailed insurance data sets.

7. Concluding Remarks

In this article, a new heavy-tailed family of claim distributions is proposed. For illustrative purposes, a special model of the proposed family is considered, called ETX-Weibull distribution. The ETX-Weibull model is very flexible in modeling heavy-tailed data. Some mathematical properties of the proposed family are derived, and maximum likelihood estimators of the model parameters are obtained. A comprehensive simulation study is presented to explore the behavior of these estimators. Actuarial measures of the ETX-Weibull model are also calculated, and a simulation study is conducted to show the usefulness of the proposed method in actuarial sciences. The simulation study of the actuarial measures shows that the proposed model possesses heavier tails. Finally, an automobile insurance claim data set is analyzed, and the comparison of the proposed model is made with other well-known competitors. The application shows that the ETX-Weibull distribution may be a good candidate for modeling heavy-tailed insurance data sets.

Appendix

R Code for the Simulation Study

In the following R code, a is used for α, s is used for σ, and g is used for γ.

library(rootSolve)
library(AdequacyModel)

## Generating a random sample of size n from the ETX-Weibull
rNHT_Weibull <- function(par, n) {
  a <- par[1]; s <- par[2]; g <- par[3]
  u <- runif(n)
  x <- c()
  for (i in 1:n) {
    # Solve G(x) = u[i], written as F(x)*(s - 1) - u[i]*(s - F(x)) = 0
    f <- function(x) (1 - exp(-g * x^a)) * (s - 1) - u[i] * (s - (1 - exp(-g * x^a)))
    x[i] <- rootSolve::uniroot.all(f, interval = c(0, 100000))
  }
  return(x)
}

## The pdf of the ETX-Weibull
dNHT_Weibull <- function(par, x) {
  a <- par[1]; s <- par[2]; g <- par[3]
  a * g * s * (s - 1) * (x^(a - 1)) * exp(-g * x^a) / ((s - 1 + exp(-g * x^a))^2)
}

## The cdf of the ETX-Weibull
pNHT_Weibull <- function(par, x) {
  a <- par[1]; s <- par[2]; g <- par[3]
  1 - ((s * exp(-g * x^a)) / (s - 1 + exp(-g * x^a)))
}

## The log-likelihood function of the ETX-Weibull
## (uses the sample x from the calling environment)
loglikelihoodNHT_Weibull <- function(par) {
  a <- par[1]; s <- par[2]; g <- par[3]
  aux <- (s - 1 + exp(-g * x^a))
  if (a > 0 && s > 1 && g > 0 && min(x) > 0 && min(aux) > 0) {
    w <- log(a) + log(s) + log(g) + log(s - 1) + (a - 1) * log(x) - (g * x^a) - 2 * log(aux)
    return(sum(w))
  } else {
    return(-9999999.9)
  }
}

## The simulation study of the ETX-Weibull
par <- c(0.9, 1.2, 0.5); a <- 0.9; s <- 1.2; g <- 0.5
n_replicas <- 1000
matriz_par  <- matrix(0, 40, 3)
matriz_bias <- matrix(0, 40, 3)
matriz_MSE  <- matrix(0, 40, 3)
matriz_std  <- matrix(0, 40, 3)
colnames(matriz_par)  <- c("a", "s", "g")
colnames(matriz_bias) <- c("a", "s", "g")
colnames(matriz_MSE)  <- c("a", "s", "g")
colnames(matriz_std)  <- c("a", "s", "g")
cont <- 1
n <- 25
while (n <= 1000) {
  par_mean <- c(0, 0, 0)
  std_mean <- c(0, 0, 0)
  bias <- c(0, 0, 0)
  MSE <- c(0, 0, 0)
  replica <- 1
  while (replica <= n_replicas) {
    print(paste("n = ", n, ", replica = ", replica))
    x <- rNHT_Weibull(par, n)
    Data <- x
    ## Optimization and generation of the simulation results
    ## (try() is needed for the "try-error" check below to have any effect)
    result <- try(optim(c(a, s, g), loglikelihoodNHT_Weibull, hessian = FALSE,
                        control = list(fnscale = -1), method = "L-BFGS-B",
                        lower = c(0.001, 1.001, 0.001), upper = c(5, 5, 5)),
                  silent = TRUE)
    if (!inherits(result, "try-error") && result$convergence == 0) {
      par_mean <- par_mean + result$par
      bias <- bias + (result$par - par)
      MSE <- MSE + (result$par - par)^2
      replica <- replica + 1
    }
  }
  par_mean <- par_mean / n_replicas
  bias <- bias / n_replicas
  MSE <- MSE / n_replicas
  matriz_par[cont, ]  <- par_mean
  matriz_std[cont, ]  <- std_mean
  matriz_bias[cont, ] <- bias
  matriz_MSE[cont, ]  <- MSE
  print("mean = "); print(par_mean)
  print("bias = "); print(bias)
  print("MSE = ");  print(MSE)
  n <- n + 25
  cont <- cont + 1
}
print(matriz_par)
print(matriz_MSE)
print(matriz_bias)
n <- seq(25, 1000, 25)

## Legend labels shared by all plots
leg <- c(expression(paste(alpha, "=", "0.9")),
         expression(paste(sigma, "=", "1.2")),
         expression(paste(gamma, "=", "0.5")))

## Plots of parameters
plot(n, matriz_par[, 1], type = "o", col = "green", lty = 1, lwd = 2,
     xlab = "n", ylab = "Estimated Parameters", ylim = c(0, 3))
lines(n, matriz_par[, 2], col = "blue", lty = 5, lwd = 2, type = "o")
lines(n, matriz_par[, 3], col = "red", lty = 8, lwd = 2, type = "o")
title("Plot of Estimated Parameters vs n")
legend(700, 2.8, legend = leg, lty = c(1, 5, 8), cex = 1, col = c("green", "blue", "red"))

## Plots of MSEs
plot(n, matriz_MSE[, 1], col = "green", lty = 1, lwd = 2, type = "o",
     xlab = "n", ylab = "MSE", ylim = c(0, 6))
lines(n, matriz_MSE[, 2], col = "blue", lty = 5, lwd = 2, type = "o")
lines(n, matriz_MSE[, 3], col = "red", lty = 8, lwd = 2, type = "o")
title("Plot of MSE vs n")
legend(700, 5.5, legend = leg, lty = c(1, 5, 8), cex = 1, col = c("green", "blue", "red"))

## Plots of absolute biases
plot(n, abs(matriz_bias[, 1]), type = "o", col = "green", lty = 1, lwd = 2,
     xlab = "n", ylab = "Absolute Bias", ylim = c(0, 1.5))
lines(n, abs(matriz_bias[, 2]), col = "blue", lty = 5, lwd = 2, type = "o")
lines(n, abs(matriz_bias[, 3]), col = "red", lty = 8, lwd = 2, type = "o")
title("Plot of Absolute Bias vs n")
legend(700, 1.4, legend = leg, lty = c(1, 5, 8), cex = 1, col = c("green", "blue", "red"))

## Plots of biases
plot(n, matriz_bias[, 1], type = "o", col = "green", lty = 1, lwd = 2,
     xlab = "n", ylab = "Bias", ylim = c(0, 1.5))
lines(n, matriz_bias[, 2], col = "blue", lty = 5, lwd = 2, type = "o")
lines(n, matriz_bias[, 3], col = "red", lty = 8, lwd = 2, type = "o")
title("Plot of Bias vs n")
legend(700, 1.4, legend = leg, lty = c(1, 5, 8), cex = 1, col = c("green", "blue", "red"))

Data Availability

This work is mainly a methodological development and has been applied on secondary data related to the insurance sciences, but, if required, data will be provided.

Disclosure

This article is drafted from the Ph.D. work of the first author (Zubair Ahmad).

Conflicts of Interest

The authors declare that they have no conflicts of interest.