Abstract
We propose a new procedure to predict the loss given default (LGD) distribution. Studies find empirical evidence that LGD values have a high concentration at the endpoint 0. Thus, we first use a logistic regression to determine the probability that the LGD value of a defaulted debt equals zero. Studies also find empirical evidence that positive LGD values have a low concentration at the endpoint 1 and a bimodal distribution on the interval (0,1). Therefore, we use a right-tailed censored beta-mixture regression to model the distribution of positive LGD data. To implement the proposed procedure, we collect 5554 defaulted debts from Moody’s Default and Recovery Database and apply an expectation–maximization algorithm to estimate the LGD distribution. Using both the k-fold cross-validation technique and the expanding rolling window approach, our empirical results confirm that the new procedure has better and more robust out-of-sample performance than its alternatives, as it yields more accurate predictions of the LGD distribution.
Notes
For studying the LGD distribution, Sigrist and Stahel (2011), Bellotti and Crook (2012), and Duan and Hwang (2016) use a single distribution, namely the gamma, normal, and beta distribution, respectively, as the driver for the LGD distribution. Panel (a) of Fig. 1 in our data analysis section shows that the sample density function of the LGD values under study has a complicated appearance. This result indicates that using a single density function to model the LGD distribution may result in a decline in model fit.
There are other approaches for studying the LGD, but each suffers from its own problems. For example, the LGD distribution prediction based on the inverse Gaussian regression (Qi and Zhao 2011) has zero probability masses at the endpoints 0 and 1. The Gaussian mixture model (Altman and Kalotay 2014) produces a cluster of transformed LGD values at a large negative or positive value; thus, it is difficult to model the LGD distribution using a Gaussian mixture without facing distributional degeneracy. The ordered probit model (Hwang et al. 2016) and the ordered logistic regression (Li et al. 2016) suffer when some partition cells have small sizes, in which case the resulting parameter estimates may become less precise. Finally, the fractional response regression, regression tree, neural network, support vector machine, and pointwise logistic model impose no distributional assumption on the LGD data (Bastos 2010; Loterman et al. 2012; Hartmann-Wendels et al. 2014; Hwang and Chu 2018).
The IBR and IMBR use the logistic regression to determine the probability that the LGD value of a defaulted debt falls into each of the three categories {0}, (0,1), and {1}. Based on the LGD data in our data analysis section, the three categories contain about 36.80%, 57.69%, and 5.51% of the entire sample, respectively. The category {1} is much smaller than the other two categories. In this case, the logistic regression implemented by each of IBR and IMBR may suffer from the imbalanced data problem or the rare events problem (Hwang et al. 2010; Maalouf and Siddiqi 2014), since it tends to be biased towards the majority class and thus may underestimate the probability of rare events. To avoid this potential problem, we use the logistic regression to determine the probability that the LGD value of a defaulted debt falls into each of the two categories {0} and (0,1]. Thomas et al. (2012) and Tong et al. (2013) have applied this two-category partitioning strategy to study the LGD.
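The two-category first stage described in this note can be sketched as an ordinary logistic regression fitted by Newton-Raphson. This is a minimal illustration on simulated data, not the paper's estimation code: the covariate, sample size, and coefficient values are all assumptions.

```python
import numpy as np

# Hypothetical sketch: estimate P(LGD = 0 | x) with a logistic regression
# on one simulated covariate; all names and values here are illustrative.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
true_b = np.array([-0.5, 1.0])                 # assumed intercept and slope
p = 1.0 / (1.0 + np.exp(-(true_b[0] + true_b[1] * x)))
y = (rng.uniform(size=n) < p).astype(float)    # 1 if LGD equals zero, else 0

X = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(25):                            # Newton-Raphson iterations
    mu = 1.0 / (1.0 + np.exp(-(X @ b)))
    W = mu * (1.0 - mu)                        # logistic variance weights
    grad = X.T @ (y - mu)                      # score vector
    hess = -(X * W[:, None]).T @ X             # Hessian of the log-likelihood
    b = b - np.linalg.solve(hess, grad)

print(b)  # maximum likelihood estimate, close to true_b
```

The fitted probability `1 / (1 + exp(-(b[0] + b[1] * x_new)))` then plays the role of the first-stage zero-LGD probability.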
The beta distribution has various interesting shapes including the symmetric U or bell shape and the unsymmetrical J or L shape that depend on the values of its shape parameters. Thus, the beta-mixture distribution allows more flexibility in modeling the bimodal distribution for positive LGD data.
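As a numerical illustration of the flexibility claimed in this note, a mixture of two beta densities with opposite skews is already bimodal. The component shapes and mixing weight below are arbitrary choices for illustration, not estimates from the paper's data.

```python
import numpy as np
from math import lgamma

# Illustrative only: a 50/50 mixture of Beta(2, 8) and Beta(8, 2) densities
# has one mode near 0.125 and another near 0.875, showing how a beta
# mixture can capture bimodal positive-LGD data on (0, 1).
def beta_pdf(y, a, b):
    return np.exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                  + (a - 1) * np.log(y) + (b - 1) * np.log1p(-y))

grid = np.linspace(0.01, 0.99, 99)
dens = 0.5 * beta_pdf(grid, 2.0, 8.0) + 0.5 * beta_pdf(grid, 8.0, 2.0)
# density peaks toward both ends with a trough in the middle: bimodal
```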
In general, maximizing log-likelihoods of the form arising in mixture models is infeasible using standard methods (Kalotay and Altman 2017).
Unal et al. (2003) propose an approach to estimate the risk-neutral density of recovery rates in default. The recovery rate is the difference between one and the LGD.
Through a straightforward calculation, \( RMSE_{RWSD,out} \) has a decomposition of \( RMSE_{RWSD,out}=\sqrt{\mu^2+s^2}. \) Here \( \mu=m^{-1}\sum_{i=1}^{m}RWSD_{out,i} \) and \( s^2=m^{-1}\sum_{i=1}^{m}\left(RWSD_{out,i}-\mu\right)^2 \) are the average and variance of the quantities \( RWSD_{out,i}, \) for \( i=1,\cdots,m. \) With this result, the metric RMSE combines the average and variance of the given measures. Thus, it can measure the performance of the LGD distribution model over multiple samples.
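The decomposition in this note is easy to verify numerically; the RWSD values below are made up solely for the check.

```python
import numpy as np

# Verify RMSE = sqrt(mu^2 + s^2), where mu and s^2 are the mean and
# population variance of the out-of-sample RWSD values (illustrative data).
rwsd = np.array([0.12, 0.08, 0.15, 0.10, 0.09])
rmse = np.sqrt(np.mean(rwsd ** 2))
mu = rwsd.mean()
s2 = np.mean((rwsd - mu) ** 2)
assert np.isclose(rmse, np.sqrt(mu ** 2 + s2))
```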
Among the 5554 defaulted debts in our entire sample, 1963 defaulted debts have Moody’s recommended discounted recovery rates equal to one, and 81 have rates greater than one.
Altman and Kalotay (2014) have observed similar results for the truncated recovery rates.
To produce standard errors of the maximum likelihood parameter estimates for each given model, we first compute the numerical Hessian matrix \( \hat{\Sigma}={\left.\left({\partial}^2/\partial{\theta}^T\partial\theta\right)\ell\left(\theta\right)\right|}_{\theta=\hat{\theta}}. \) Here \( \ell(\theta) \) denotes the log-likelihood function of the estimation sample based on the given model, \( \theta \) is the parameter vector of the model, and \( \hat{\theta} \) is the maximum likelihood estimate of \( \theta. \) We provide the formulas for \( \ell(\theta) \) in subsections 2.1–2.3 and 3.1 for the considered models. In this paper, we use the command hessp in the software GAUSS to generate the numerical Hessian matrix \( \hat{\Sigma} \) for each model. Then, for the given model, the standard error of the ith component of \( \hat{\theta} \) is taken as the square root of the ith diagonal element of the matrix \( {\left(-\hat{\Sigma}\right)}^{-1}; \) see subsection 4.2.2 of Serfling (1980).
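The computation in this note can be sketched as follows. The model here is a toy normal-mean log-likelihood, and the central-difference Hessian stands in for GAUSS's hessp command; both are assumptions made for illustration only.

```python
import numpy as np

# Sketch: standard errors from the inverse of the negated numerical Hessian
# of a log-likelihood, evaluated at the maximum likelihood estimate.
rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.0, size=500)

def loglik(theta):
    # Toy model (an assumption): normal log-likelihood in (mu, log sigma).
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    return np.sum(-0.5 * np.log(2 * np.pi) - log_sigma
                  - 0.5 * ((y - mu) / sigma) ** 2)

def num_hessian(f, theta, h=1e-5):
    # Central-difference second derivatives, in place of GAUSS's hessp.
    d = len(theta)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d); ei[i] = h
            ej = np.zeros(d); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return H

theta_hat = np.array([y.mean(), np.log(y.std())])   # MLE in this toy model
H = num_hessian(loglik, theta_hat)
se = np.sqrt(np.diag(np.linalg.inv(-H)))            # standard errors
# se[0] matches the textbook value sigma_hat / sqrt(n) for the mean
```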
When performing the k-fold cross-validation procedure, Kalotay and Altman (2017) use a different approach to partition the entire sample into k subsamples. They randomly assign each observation to one of the k subsamples. In this case, the sizes of their k subsamples cannot be determined by the experimenter in advance.
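The contrast in this note can be made concrete with a small simulation; the sample size and number of folds below are arbitrary illustrative choices.

```python
import numpy as np

# Random per-observation assignment (as in Kalotay and Altman 2017) gives
# fold sizes the experimenter cannot fix in advance, whereas shuffling and
# splitting into equal parts gives predetermined sizes.
rng = np.random.default_rng(4)
n, k = 100, 5
random_assign = rng.integers(0, k, size=n)          # per-observation draw
sizes_random = np.bincount(random_assign, minlength=k)

perm = rng.permutation(n)                           # predetermined sizes
folds_equal = np.array_split(perm, k)
sizes_equal = [len(f) for f in folds_equal]
print(sizes_random, sizes_equal)                    # uneven vs. all equal
```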
In this paper, we use the constrained optimization procedure co in the software GAUSS to find the maximizer of each of the weighted log-likelihood functions \( {\tilde{\ell}}_1\left({\theta}_1\right), \) \( {\tilde{\ell}}_2\left({\theta}_2\right), \) and \( {\tilde{\ell}}_3\left({\eta}_1\right). \) When doing so for \( {\tilde{\ell}}_2\left({\theta}_2\right), \) we supply the procedure with the formulas of \( \nabla=\left(\partial/\partial{\theta}_2\right){\tilde{\ell}}_2\left({\theta}_2\right) \) and \( \Sigma=\left({\partial}^2/\partial{\theta}_2^T\partial{\theta}_2\right){\tilde{\ell}}_2\left({\theta}_2\right) \) (Ferrari and Pinheiro 2011) to compute the gradient vector and Hessian matrix required by the Newton algorithm. This procedure ensures that the Hessian matrix is positive definite. The same condition also applies to \( {\tilde{\ell}}_3\left({\eta}_1\right). \) But for \( {\tilde{\ell}}_1\left({\theta}_1\right), \) we are unable to supply those formulas since they depend on the integral term \( \Omega\{(1+\rho)^{-1};\alpha_1(x),\beta_1(x)\}, \) so the procedure replaces the associated gradient vector and Hessian matrix with their numerical substitutes. If the numerical Hessian matrix is not invertible, we restart the procedure for finding the maximizer of \( {\tilde{\ell}}_1\left({\theta}_1\right) \) with a different initial vector of \( \theta_1. \)
For estimating the Gaussian mixture model, the EM algorithm in Kalotay and Altman (2017) is stopped when the mean absolute difference (MAD) between successive sets of parameter estimates is less than or equal to 0.005. We set \( {\overline{\psi}}^{\ast}={d}^{-1}\left(|{\psi}_1|+\cdots+|{\psi}_d|\right) \) and \( \left\Vert\psi\right\Vert={\left({\psi}_1^2+\cdots+{\psi}_d^2\right)}^{1/2} \) as the MAD and the Euclidean length (EL) of the d-dimensional vector \( {\theta}^{(k+1)}-{\theta}^{(k)}\equiv\psi=\left({\psi}_1,\cdots,{\psi}_d\right), \) respectively, where \( {\theta}^{(k)} \) and \( {\theta}^{(k+1)} \) are two successive sets of parameter estimates. By the relation \( {\overline{\psi}}^{\ast}\le\left\Vert\psi\right\Vert, \) if a solution satisfies the EL stopping rule \( \left\Vert\psi\right\Vert<r, \) then it also satisfies the MAD stopping rule \( {\overline{\psi}}^{\ast}<r, \) where r is a given convergence tolerance.
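The inequality in this note follows from the Cauchy-Schwarz inequality, \( |\psi_1|+\cdots+|\psi_d|\le\sqrt{d}\,\Vert\psi\Vert, \) so \( {\overline{\psi}}^{\ast}\le\Vert\psi\Vert/\sqrt{d}\le\Vert\psi\Vert. \) A brief numerical check (with randomly drawn vectors, purely for illustration):

```python
import numpy as np

# Check MAD(psi) <= EL(psi) on random vectors; this implies the
# Euclidean-length stopping rule is at least as strict as the
# mean-absolute-difference rule for the same tolerance r.
rng = np.random.default_rng(2)
for _ in range(1000):
    psi = rng.normal(size=int(rng.integers(1, 10)))
    mad = np.mean(np.abs(psi))
    el = np.linalg.norm(psi)
    assert mad <= el + 1e-12
print("MAD <= EL held in all trials")
```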
References
Acharya VV, Bharath ST, Srinivasan A (2007) Does industrywide distress affect defaulted firms? Evidence from creditor recoveries. J Financ Econ 85:787–821
Agresti A (2002) Categorical data analysis. Wiley, New York
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automatic Control 19:716–723
Altman EI (2014) Distressed and defaulted bond investment returns outperformed common stocks and high-yield bonds over last 10 years. http://www.mvis-indices.com/mvis-onehundred/distressed-and-defaulted-bonds-portfolios-continued-outperformance-in-2014. Accessed on 29 July 2018
Altman EI, Kalotay EA (2014) Ultimate recovery mixtures. J Bank Financ 40:116–129
Altman EI, Kishore VM (1996) Almost everything you wanted to know about recoveries on defaulted bonds. Financ Anal J 52:57–64
Altman EI, Brady B, Resti A, Sironi A (2005) The link between default and recovery rates: theory, empirical evidence, and implications. J Business 78:2203–2228
Bastos JA (2010) Forecasting bank loans loss-given-default. J Bank Financ 34:2510–2517
Bastos JA (2014) Ensemble predictions of recovery rates. J Financ Serv Res 46:177–193
Bellotti T, Crook J (2012) Loss given default models incorporating macroeconomic variables for credit cards. Int J Forecast 28:171–182
Calabrese R (2014) Predicting bank loan recovery rates with mixed continuous-discrete model. Appl Stoch Model Bus Ind 30:99–114
Calabrese R, Zenga M (2010) Bank loan recovery rates: measuring and nonparametric density estimation. J Bank Financ 34:903–911
Caselli S, Gatti S, Querci F (2008) The sensitivity of the loss given default rate to systematic risk: new empirical evidence on bank loans. J Financ Serv Res 34:1–34
Chava S, Stefanescu C, Turnbull S (2011) Modeling the loss distribution. Manag Sci 57:1267–1287
Chu CK, Hwang RC (2019) Predicting loss distributions for small-size defaulted-debt portfolios using a convolution technique that allows probability masses to occur at boundary points. J Financ Serv Res 56:95–117
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39:1–38
Dermine J, de Carvalho CN (2006) Bank loan losses-given-default: a case study. J Bank Financ 30:1219–1243
Duan JC, Hwang RC (2016) Predicting recovery rate at the time of corporate default. http://www.rmi.nus.edu.sg/duanjc. Accessed on 29 July 2018
Duffie D, Gârleanu N (2001) Risk and valuation of collateralized debt obligations. Fin Anal J 57:41–59
Ferrari S, Pinheiro EC (2011) Improved likelihood inference in beta regression. J Stat Comput Simul 81:431–443
Friedman CA, Sandow S (2005) Estimating conditional probability distributions of recovery rates: a utility-based approach. In: Altman EI, Resti A, Sironi A (eds) Recovery risk: the next challenge in credit risk management. Risk Books, London, pp 347–360
Frye J (2000) Collateral damage. Risk 13:91–94
Hannan EJ, Quinn BG (1979) The determination of the order of an autoregression. J R Stat Soc Ser B 41:190–195
Harris MN, Zhao X (2007) A zero-inflated ordered probit model, with an application to modelling tobacco consumption. J Econom 141:1073–1099
Hartmann-Wendels T, Miller P, Tows E (2014) Loss given default for leasing: parametric and nonparametric estimations. J Bank Financ 40:364–375
Hillegeist SA, Keating EK, Cram DP, Lundstedt KG (2004) Assessing the probability of bankruptcy. Rev Acc Stud 9:5–34
Hwang RC (2012) A varying-coefficient default model. Int J Forecast 28:675–688
Hwang RC, Chu CK (2018) A logistic regression point of view toward loss given default distribution estimation. Quan Financ 18:419–435
Hwang RC, Chung H, Chu CK (2010) Predicting issuer credit ratings using a semiparametric method. J Emp Financ 17:120–137
Hwang RC, Chung H, Chu CK (2016) A two-stage probit model for predicting recovery rates. J Financ Serv Res 50:311–339
Jarrow R, Lando D, Yu F (2005) Default risk and diversification: theory and empirical implications. Math Financ 15:1–26
Kalotay EA, Altman EI (2017) Intertemporal forecasts of defaulted bond recoveries and portfolio losses. Rev Financ 21:433–463
Keisman D, Van de Castle K (1999) Recovering your money: insights into losses from defaults. Stan Poor’s Credit Week 16:29–34
Kibria BMG, Månsson K, Shukur G (2013) Some ridge regression estimators for the zero-inflated Poisson model. J Appl Stat 40:721–735
Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34:1–14
Lando D, Nielsen MS (2010) Correlation in corporate defaults: contagion or conditional independence? J Financ Intermediation 19:355–372
Li P, Qi M, Zhang X, Zhao X (2016) Further investigation of parametric loss given default modeling. J Credit Risk 12:17–47
Lin TH, Tsai MH (2013) Modeling health survey data with excessive zero and K responses. Stat Med 32:1572–1583
Loterman G, Brown I, Martens D, Mues C, Baesens B (2012) Benchmarking regression algorithms for loss given default modeling. Int J Forecast 28:161–170
Maalouf M, Siddiqi M (2014) Weighted logistic regression for large-scale imbalanced and rare events data. Knowl-Based Syst 59:142–148
Oliveira MR, Louzada F, Pereira GHA, Moreira F, Calabrese R (2015) Inflated mixture models: applications to multimodality in loss given default. http://ssrn.com/abstract=2634919. Accessed on 29 July 2018
Oliveira MR, Moreira F, Louzada F (2017) The zero-inflated promotion cure rate model applied to financial data on time-to-default. Cogent Econ Financ 5:1395950
Ospina R, Ferrari SLP (2010) Inflated beta distributions. Stat Papers 51:111–126
Qi M, Zhao X (2011) Comparison of modeling methods for loss given default. J Bank Financ 35:2842–2855
Resti Y, Ismail N, Jamaan SH (2013) Estimation of claim cost data using zero adjusted gamma and inverse Gaussian regression models. J Math Stat 9:186–192
Rose CE, Martin SW, Wannemuehler KA, Plikaytis BD (2006) On the use of zero-inflated and hurdle models for modeling vaccine adverse event count data. J Biopharm Stat 16:463–481
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Serfling R (1980) Approximation theorems of mathematical statistics. Wiley, New York
Siao JS, Hwang RC, Chu CK (2016) Predicting recovery rates using logistic quantile regression with bounded outcomes. Quan Financ 16:777–792
Sigrist F, Stahel WA (2011) Using the censored gamma distribution for modeling fractional response variables with an application to loss given default. ASTIN Bulletin 41:673–710
Thomas LC, Matuszyk A, Moore A (2012) Comparing debt characteristics and LGD models for different collections policies. Int J Forecast 28:196–203
Tong ENC, Mues C, Thomas L (2013) A zero-adjusted gamma model for mortgage loan loss given default. Int J Forecast 29:548–562
Unal H, Madan D, Güntay L (2003) Pricing the risk of recovery in default with absolute priority rule violation. J Bank Financ 27:1001–1025
Yang S, Harlow LL, Puggioni G, Redding CA (2017) A comparison of different methods of zero-inflated data analysis and an application in health surveys. J Mod App Stat Meth 16:518–543
Yashkir O, Yashkir Y (2013) Loss given default modeling: a comparative analysis. J Risk Model Valid 7:25–59
Acknowledgements
The authors thank the reviewers and the editor for their valuable comments and suggestions that have greatly improved the presentation of this paper. The Ministry of Science and Technology of Taiwan provided support for this research.
Appendices
A sketch of the proof
To decompose \( \ell_{ZCBR}\left({\delta}_{0,1},{\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right), \) we first replace its components \( p_{ZCBR,0}(x_i) \) and \( p_{ZCBR,1}(x_i) \) with \( \delta_0(x_i) \) and \( \eta\{1-\delta_0(x_i)\}[1-\Omega\{(1+\rho)^{-1};\alpha_1(x_i),\beta_1(x_i)\}], \) respectively, and rearrange the result as:
where
Then, through a straightforward calculation, the quantities \( L_1 \) and \( L_2 \) become:
The proof for the decomposition of \( \ell_{ZCBR}\left({\delta}_{0,1},{\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right) \) is complete.
An EM algorithm
To describe the EM algorithm for finding the maximizer of \( \ell_{ZCBR,2}\left({\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right), \) we set the following notation. Let
where \( \rho=\ln\{1+\exp({\rho}_1)\}, \) \( {\theta}_1=\left({\rho}_1,{\alpha}_{1,1},{\beta}_{1,1}\right), \) and \( {\theta}_2=\left({\alpha}_{2,1},{\beta}_{2,1}\right). \) Using these results, the conditional mixture distribution of the positive LGD data is:
where \( y\in(0,1], \) \( \eta=\frac{\exp\left({\eta}_1\right)}{1+\exp\left({\eta}_1\right)}, \) and \( \theta=\left({\theta}_1,{\theta}_2,{\eta}_1\right)=\left({\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right). \)
To apply the EM algorithm to maximize \( {\ell}_{ZCBR,2}\left(\theta\right)={\sum}_{i=1,{y}_i>0}^n\ln\left\{{f}_M\left({y}_i|{x}_i,\theta\right)\right\}, \) we introduce a set of dummy latent data \( z_i \) that are indicator variables linking observations \( y_i>0 \) to the mixture components \( f_{M,1}(y_i|x_i,\theta_1) \) and \( f_{M,2}(y_i|x_i,\theta_2). \) Thus, we write the log-likelihood function of the observed and latent data as:
where
The EM algorithm for maximizing \( {\ell}_{ZCBR,2}^{\ast}\left(\theta,{z}_1,\cdots,{z}_n\right) \) comprises the following two-step procedure. The first step of the algorithm computes the conditional expectation:
Here \( {\theta}^{(k)}=\left\{{\theta}_1^{(k)},{\theta}_2^{(k)},{\eta}_1^{(k)}\right\} \) and \( {\tilde{\ell}}_1\left({\theta}_1\right), \) \( {\tilde{\ell}}_2\left({\theta}_2\right), \) and \( {\tilde{\ell}}_3\left({\eta}_1\right) \) are \( \ell_1(\theta_1), \) \( \ell_2(\theta_2), \) and \( \ell_3(\eta_1) \) with \( z_i \) replaced by \( {\tilde{z}}_i. \) The quantity \( {\tilde{z}}_i \) has the formula:
for each \( i=1,\cdots,n, \) where \( {\eta}^{(k)}=\frac{\exp\left\{{\eta}_1^{(k)}\right\}}{1+\exp\left\{{\eta}_1^{(k)}\right\}}. \) The second step of the algorithm involves maximizing \( E\left\{{\ell}_{ZCBR,2}^{\ast}\left(\theta,{z}_1,\cdots,{z}_n\right)|{\theta}^{(k)}\right\} \) to obtain \( {\theta}^{\left(k+1\right)}=\arg{\max}_{\theta}E\left\{{\ell}_{ZCBR,2}^{\ast}\left(\theta,{z}_1,\cdots,{z}_n\right)|{\theta}^{(k)}\right\}. \) It is performed separately by finding \( {\theta}_1^{\left(k+1\right)}=\arg{\max}_{\theta_1}{\tilde{\ell}}_1\left({\theta}_1\right), \) \( {\theta}_2^{\left(k+1\right)}=\arg{\max}_{\theta_2}{\tilde{\ell}}_2\left({\theta}_2\right), \) and \( {\eta}_1^{\left(k+1\right)}=\arg{\max}_{\eta_1}{\tilde{\ell}}_3\left({\eta}_1\right). \) Thus, \( {\theta}^{\left(k+1\right)}=\left\{{\theta}_1^{\left(k+1\right)},{\theta}_2^{\left(k+1\right)},{\eta}_1^{\left(k+1\right)}\right\}. \) The two-step procedure continues until the convergence criterion \( \left\Vert{\theta}^{(k+1)}-{\theta}^{(k)}\right\Vert<0.005 \) is satisfied. The notation \( \left\Vert\psi\right\Vert \) denotes the Euclidean length of the given vector \( \psi. \)
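The E-step and M-step above can be sketched in code. This is a heavily simplified illustration, not the paper's estimator: the two beta components have fixed shapes with no covariates or censoring, so (as an assumption) only the mixing weight corresponding to \( \eta \) has to be updated in the M-step, which then has a closed form.

```python
import numpy as np
from math import lgamma

# Simplified EM sketch for a two-component beta mixture on (0, 1):
# fixed component shapes (an assumption); only the mixing weight eta
# is estimated, mirroring the E-step / M-step structure above.
def log_beta_pdf(y, a, b):
    return (lgamma(a + b) - lgamma(a) - lgamma(b)
            + (a - 1) * np.log(y) + (b - 1) * np.log1p(-y))

rng = np.random.default_rng(3)
n = 4000
z = rng.uniform(size=n) < 0.7                  # true mixing weight 0.7
y = np.where(z, rng.beta(2.0, 8.0, n), rng.beta(8.0, 2.0, n))

a1, b1, a2, b2 = 2.0, 8.0, 8.0, 2.0            # fixed component shapes
eta = 0.5                                      # initial mixing weight
for _ in range(500):
    # E-step: posterior probability z_tilde that y_i is from component 1
    f1 = np.exp(log_beta_pdf(y, a1, b1))
    f2 = np.exp(log_beta_pdf(y, a2, b2))
    z_tilde = eta * f1 / (eta * f1 + (1 - eta) * f2)
    # M-step: closed-form update of the mixing weight
    eta_new = z_tilde.mean()
    # stopping rule ||theta^(k+1) - theta^(k)|| < 0.005 (one-dimensional here)
    if abs(eta_new - eta) < 0.005:
        eta = eta_new
        break
    eta = eta_new

print(eta)  # close to the true mixing weight 0.7
```

In the paper's setting, the M-step additionally refits the beta-regression parameters \( \theta_1 \) and \( \theta_2 \) by numerical optimization, as described in the footnote on the co procedure.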
Cite this article
Hwang, RC., Chu, CK. & Yu, K. Predicting the Loss Given Default Distribution with the Zero-Inflated Censored Beta-Mixture Regression that Allows Probability Masses and Bimodality. J Financ Serv Res 59, 143–172 (2021). https://doi.org/10.1007/s10693-020-00333-w
Keywords
- Conditional distribution
- Expectation–maximization algorithm
- Logistic regression
- Loss given default
- Right-tailed censored beta-mixture regression
- Zero-inflated model