Abstract
We propose a new procedure to predict the loss given default (LGD) distribution. Studies find empirical evidence that LGD values have a high concentration at the endpoint 0. Thus, we first use a logistic regression to determine the probability that the LGD value of a defaulted debt equals zero. Studies also find empirical evidence that positive LGD values have a low concentration at the endpoint 1 and a bimodal distribution on the interval (0,1). Therefore, we use a right-tailed censored beta-mixture regression to model the distribution of positive LGD data. To implement the proposed procedure, we collect 5554 defaulted debts from Moody’s Default and Recovery Database and apply an expectation–maximization algorithm to estimate the LGD distribution. Using both the k-fold cross-validation technique and the expanding rolling window approach, our empirical results confirm that the new procedure has better and more robust out-of-sample performance than its alternatives, as it yields more accurate predictions of the LGD distribution.
Notes
For studying the LGD distribution, Sigrist and Stahel (2011), Bellotti and Crook (2012), and Duan and Hwang (2016) use a single distribution, namely the gamma, normal, and beta distribution, respectively, as the driver for the LGD distribution. Panel (a) of Fig. 1 in our data analysis section shows that the sample density function of the LGD values under study has a complicated appearance. This result indicates that using a single density function to model the LGD distribution may result in a decline in model fit.
There are other approaches for studying the LGD, but each suffers from its own problems. For example, the LGD distribution prediction based on the inverse Gaussian regression (Qi and Zhao 2011) has zero probability masses at the endpoints 0 and 1. The Gaussian mixture model (Altman and Kalotay 2014) produces a cluster of transformed LGD values at a large negative or positive value; thus, it is difficult to model the LGD distribution using a Gaussian mixture without facing distributional degeneracy. The ordered probit model (Hwang et al. 2016) and the ordered logistic regression (Li et al. 2016) suffer when some partition cells have small sizes, in which case the resulting parameter estimates may become less precise. Finally, the fractional response regression, regression tree, neural network, support vector machine, and pointwise logistic model impose no distributional assumption on the LGD data (Bastos 2010; Loterman et al. 2012; Hartmann-Wendels et al. 2014; Hwang and Chu 2018).
The IBR and IMBR use the logistic regression to determine the probability that the LGD value of a defaulted debt falls into each of the three categories {0}, (0,1), and {1}. Based on the LGD data in our data analysis section, the three categories contain about 36.80%, 57.69%, and 5.51% of the entire sample, respectively. The category {1} is much smaller than the other two categories. In this case, the logistic regression implemented by each of IBR and IMBR may suffer from the imbalanced data problem or the rare events problem (Hwang et al. 2010; Maalouf and Siddiqi 2014), since it tends to be biased towards the majority class and thus may underestimate the probability of rare events. To avoid this potential problem, we use the logistic regression to determine the probability that the LGD value of a defaulted debt falls into each of the two categories {0} and (0,1]. Thomas et al. (2012) and Tong et al. (2013) have applied this two-category partitioning strategy to study the LGD.
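The two-category first stage described in this note can be sketched as an ordinary logistic regression fitted by Newton-Raphson. This is a minimal illustration on simulated data, not the paper's estimation code: the covariate, sample size, and coefficient values are all assumptions.

```python
import numpy as np

# Hypothetical sketch: estimate P(LGD = 0 | x) with a logistic regression
# on one simulated covariate; all names and values here are illustrative.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
true_b = np.array([-0.5, 1.0])                 # assumed intercept and slope
p = 1.0 / (1.0 + np.exp(-(true_b[0] + true_b[1] * x)))
y = (rng.uniform(size=n) < p).astype(float)    # 1 if LGD equals zero, else 0

X = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(25):                            # Newton-Raphson iterations
    mu = 1.0 / (1.0 + np.exp(-(X @ b)))
    W = mu * (1.0 - mu)                        # logistic variance weights
    grad = X.T @ (y - mu)                      # score vector
    hess = -(X * W[:, None]).T @ X             # Hessian of the log-likelihood
    b = b - np.linalg.solve(hess, grad)

print(b)  # maximum likelihood estimate, close to true_b
```

The fitted probability `1 / (1 + exp(-(b[0] + b[1] * x_new)))` then plays the role of the first-stage zero-LGD probability.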
The beta distribution has various interesting shapes including the symmetric U or bell shape and the unsymmetrical J or L shape that depend on the values of its shape parameters. Thus, the beta-mixture distribution allows more flexibility in modeling the bimodal distribution for positive LGD data.
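As a numerical illustration of the flexibility claimed in this note, a mixture of two beta densities with opposite skews is already bimodal. The component shapes and mixing weight below are arbitrary choices for illustration, not estimates from the paper's data.

```python
import numpy as np
from math import lgamma

# Illustrative only: a 50/50 mixture of Beta(2, 8) and Beta(8, 2) densities
# has one mode near 0.125 and another near 0.875, showing how a beta
# mixture can capture bimodal positive-LGD data on (0, 1).
def beta_pdf(y, a, b):
    return np.exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                  + (a - 1) * np.log(y) + (b - 1) * np.log1p(-y))

grid = np.linspace(0.01, 0.99, 99)
dens = 0.5 * beta_pdf(grid, 2.0, 8.0) + 0.5 * beta_pdf(grid, 8.0, 2.0)
# density peaks toward both ends with a trough in the middle: bimodal
```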
In general, maximizing log-likelihoods of the form arising in mixture models is infeasible using standard methods (Kalotay and Altman 2017).
Unal et al. (2003) propose an approach to estimate the risk-neutral density of recovery rates in default. The recovery rate is the difference between one and the LGD.
Through a straightforward calculation, \( RMSE_{RWSD,out} \) has a decomposition of \( RMSE_{RWSD,out}=\sqrt{\mu^2+s^2}. \) Here \( \mu=m^{-1}\sum_{i=1}^{m}RWSD_{out,i} \) and \( s^2=m^{-1}\sum_{i=1}^{m}\left(RWSD_{out,i}-\mu\right)^2 \) are the average and variance of the quantities \( RWSD_{out,i}, \) for \( i=1,\cdots,m. \) With this result, the metric RMSE combines the average and variance of the given measures. Thus, it can measure the performance of the LGD distribution model over multiple samples.
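The decomposition in this note is easy to verify numerically; the RWSD values below are made up solely for the check.

```python
import numpy as np

# Verify RMSE = sqrt(mu^2 + s^2), where mu and s^2 are the mean and
# population variance of the out-of-sample RWSD values (illustrative data).
rwsd = np.array([0.12, 0.08, 0.15, 0.10, 0.09])
rmse = np.sqrt(np.mean(rwsd ** 2))
mu = rwsd.mean()
s2 = np.mean((rwsd - mu) ** 2)
assert np.isclose(rmse, np.sqrt(mu ** 2 + s2))
```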
Among the 5554 defaulted debts in our entire sample, 1963 defaulted debts have Moody’s recommended discounted recovery rates equal to one, and 81 have rates greater than one.
Altman and Kalotay (2014) have observed similar results for the truncated recovery rates.
To produce standard errors of the maximum likelihood parameter estimates for each given model, we first compute the numerical Hessian matrix \( \hat{\Sigma}={\left.\left({\partial}^2/\partial{\theta}^T\partial\theta\right)\ell\left(\theta\right)\right|}_{\theta=\hat{\theta}}. \) Here \( \ell(\theta) \) denotes the log-likelihood function of the estimation sample based on the given model, \( \theta \) is the parameter vector of the model, and \( \hat{\theta} \) is the maximum likelihood estimate of \( \theta. \) We provide the formulas for \( \ell(\theta) \) in subsections 2.1–2.3 and 3.1 for the considered models. In this paper, we use the command hessp in the software GAUSS to generate the numerical Hessian matrix \( \hat{\Sigma} \) for each model. Then, for the given model, the standard error of the ith component of \( \hat{\theta} \) is taken as the square root of the ith diagonal element of the matrix \( {\left(-\hat{\Sigma}\right)}^{-1}; \) see subsection 4.2.2 of Serfling (1980).
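The computation in this note can be sketched as follows. The model here is a toy normal-mean log-likelihood, and the central-difference Hessian stands in for GAUSS's hessp command; both are assumptions made for illustration only.

```python
import numpy as np

# Sketch: standard errors from the inverse of the negated numerical Hessian
# of a log-likelihood, evaluated at the maximum likelihood estimate.
rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.0, size=500)

def loglik(theta):
    # Toy model (an assumption): normal log-likelihood in (mu, log sigma).
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    return np.sum(-0.5 * np.log(2 * np.pi) - log_sigma
                  - 0.5 * ((y - mu) / sigma) ** 2)

def num_hessian(f, theta, h=1e-5):
    # Central-difference second derivatives, in place of GAUSS's hessp.
    d = len(theta)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d); ei[i] = h
            ej = np.zeros(d); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return H

theta_hat = np.array([y.mean(), np.log(y.std())])   # MLE in this toy model
H = num_hessian(loglik, theta_hat)
se = np.sqrt(np.diag(np.linalg.inv(-H)))            # standard errors
# se[0] matches the textbook value sigma_hat / sqrt(n) for the mean
```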
When performing the k-fold cross-validation procedure, Kalotay and Altman (2017) use a different approach to partition the entire sample into k subsamples. They randomly assign each observation to one of the k subsamples. In this case, the sizes of their k subsamples cannot be determined by the experimenter in advance.
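The contrast in this note can be made concrete with a small simulation; the sample size and number of folds below are arbitrary illustrative choices.

```python
import numpy as np

# Random per-observation assignment (as in Kalotay and Altman 2017) gives
# fold sizes the experimenter cannot fix in advance, whereas shuffling and
# splitting into equal parts gives predetermined sizes.
rng = np.random.default_rng(4)
n, k = 100, 5
random_assign = rng.integers(0, k, size=n)          # per-observation draw
sizes_random = np.bincount(random_assign, minlength=k)

perm = rng.permutation(n)                           # predetermined sizes
folds_equal = np.array_split(perm, k)
sizes_equal = [len(f) for f in folds_equal]
print(sizes_random, sizes_equal)                    # uneven vs. all equal
```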
In this paper, we use the constrained optimization procedure co in the software GAUSS to find the maximizer of each of the weighted log-likelihood functions \( {\tilde{\ell}}_1\left({\theta}_1\right), \) \( {\tilde{\ell}}_2\left({\theta}_2\right), \) and \( {\tilde{\ell}}_3\left({\eta}_1\right). \) When doing so for \( {\tilde{\ell}}_2\left({\theta}_2\right), \) we supply the procedure with the formulas of \( \nabla=\left(\partial/\partial{\theta}_2\right){\tilde{\ell}}_2\left({\theta}_2\right) \) and \( \Sigma=\left({\partial}^2/\partial{\theta}_2^T\partial{\theta}_2\right){\tilde{\ell}}_2\left({\theta}_2\right) \) (Ferrari and Pinheiro 2011) to compute the gradient vector and Hessian matrix required by the Newton algorithm. This procedure ensures that the Hessian matrix is positive definite. The same condition also applies to \( {\tilde{\ell}}_3\left({\eta}_1\right). \) But for \( {\tilde{\ell}}_1\left({\theta}_1\right), \) we are unable to supply those formulas since they depend on the integral term \( \Omega\{(1+\rho)^{-1};\alpha_1(x),\beta_1(x)\}, \) so the procedure replaces the associated gradient vector and Hessian matrix with their numerical substitutes. If the numerical Hessian matrix is not invertible, we restart the procedure for finding the maximizer of \( {\tilde{\ell}}_1\left({\theta}_1\right) \) with a different initial vector of \( \theta_1. \)
For estimating the Gaussian mixture model, the EM algorithm in Kalotay and Altman (2017) is stopped when the mean absolute difference (MAD) between successive sets of parameter estimates is less than or equal to 0.005. We set \( {\overline{\psi}}^{\ast}={d}^{-1}\left(|{\psi}_1|+\cdots+|{\psi}_d|\right) \) and \( \left\Vert\psi\right\Vert={\left({\psi}_1^2+\cdots+{\psi}_d^2\right)}^{1/2} \) as the MAD and the Euclidean length (EL) of the d-dimensional vector \( {\theta}^{(k+1)}-{\theta}^{(k)}\equiv\psi=\left({\psi}_1,\cdots,{\psi}_d\right), \) respectively, where \( {\theta}^{(k)} \) and \( {\theta}^{(k+1)} \) are two successive sets of parameter estimates. By the relation \( {\overline{\psi}}^{\ast}\le\left\Vert\psi\right\Vert, \) if a solution satisfies the EL stopping rule \( \left\Vert\psi\right\Vert<r, \) then it also satisfies the MAD stopping rule \( {\overline{\psi}}^{\ast}<r, \) where r is a given convergence tolerance.
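The inequality in this note follows from the Cauchy-Schwarz inequality, \( |\psi_1|+\cdots+|\psi_d|\le\sqrt{d}\,\Vert\psi\Vert, \) so \( {\overline{\psi}}^{\ast}\le\Vert\psi\Vert/\sqrt{d}\le\Vert\psi\Vert. \) A brief numerical check (with randomly drawn vectors, purely for illustration):

```python
import numpy as np

# Check MAD(psi) <= EL(psi) on random vectors; this implies the
# Euclidean-length stopping rule is at least as strict as the
# mean-absolute-difference rule for the same tolerance r.
rng = np.random.default_rng(2)
for _ in range(1000):
    psi = rng.normal(size=int(rng.integers(1, 10)))
    mad = np.mean(np.abs(psi))
    el = np.linalg.norm(psi)
    assert mad <= el + 1e-12
print("MAD <= EL held in all trials")
```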
References
Acharya VV, Bharath ST, Srinivasan A (2007) Does industrywide distress affect defaulted firms? Evidence from creditor recoveries. J Financ Econ 85:787–821
Agresti A (2002) Categorical data analysis. Wiley, New York
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automatic Control 19:716–723
Altman EI (2014) Distressed and defaulted bond investment returns outperformed common stocks and high-yield bonds over last 10 years. http://www.mvis-indices.com/mvis-onehundred/distressed-and-defaulted-bonds-portfolios-continued-outperformance-in-2014. Accessed on 29 July 2018
Altman EI, Kalotay EA (2014) Ultimate recovery mixtures. J Bank Financ 40:116–129
Altman EI, Kishore VM (1996) Almost everything you wanted to know about recoveries on defaulted bonds. Financ Anal J 52:57–64
Altman EI, Brady B, Resti A, Sironi A (2005) The link between default and recovery rates: theory, empirical evidence, and implications. J Business 78:2203–2228
Bastos JA (2010) Forecasting bank loans loss-given-default. J Bank Financ 34:2510–2517
Bastos JA (2014) Ensemble predictions of recovery rates. J Financ Serv Res 46:177–193
Bellotti T, Crook J (2012) Loss given default models incorporating macroeconomic variables for credit cards. Int J Forecast 28:171–182
Calabrese R (2014) Predicting bank loan recovery rates with mixed continuous-discrete model. Appl Stoch Model Bus Ind 30:99–114
Calabrese R, Zenga M (2010) Bank loan recovery rates: measuring and nonparametric density estimation. J Bank Financ 34:903–911
Caselli S, Gatti S, Querci F (2008) The sensitivity of the loss given default rate to systematic risk: new empirical evidence on bank loans. J Financ Serv Res 34:1–34
Chava S, Stefanescu C, Turnbull S (2011) Modeling the loss distribution. Manag Sci 57:1267–1287
Chu CK, Hwang RC (2019) Predicting loss distributions for small-size defaulted-debt portfolios using a convolution technique that allows probability masses to occur at boundary points. J Financ Serv Res 56:95–117
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39:1–38
Dermine J, de Carvalho CN (2006) Bank loan losses-given-default: a case study. J Bank Financ 30:1219–1243
Duan JC, Hwang RC (2016) Predicting recovery rate at the time of corporate default. http://www.rmi.nus.edu.sg/duanjc. Accessed on 29 July 2018
Duffie D, Gârleanu N (2001) Risk and valuation of collateralized debt obligations. Fin Anal J 57:41–59
Ferrari S, Pinheiro EC (2011) Improved likelihood inference in beta regression. J Stat Comput Simul 81:431–443
Friedman CA, Sandow S (2005) Estimating conditional probability distributions of recovery rates: a utility-based approach. In: Altman EI, Resti A, Sironi A (eds) Recovery risk: the next challenge in credit risk management. Risk Books, London, pp 347–360
Frye J (2000) Collateral damage. Risk 13:91–94
Hannan EJ, Quinn BG (1979) The determination of the order of an autoregression. J R Stat Soc Ser B 41:190–195
Harris MN, Zhao X (2007) A zero-inflated ordered probit model, with an application to modelling tobacco consumption. J Econom 141:1073–1099
Hartmann-Wendels T, Miller P, Tows E (2014) Loss given default for leasing: parametric and nonparametric estimations. J Bank Financ 40:364–375
Hillegeist SA, Keating EK, Cram DP, Lundstedt KG (2004) Assessing the probability of bankruptcy. Rev Acc Stud 9:5–34
Hwang RC (2012) A varying-coefficient default model. Int J Forecast 28:675–688
Hwang RC, Chu CK (2018) A logistic regression point of view toward loss given default distribution estimation. Quan Financ 18:419–435
Hwang RC, Chung H, Chu CK (2010) Predicting issuer credit ratings using a semiparametric method. J Emp Financ 17:120–137
Hwang RC, Chung H, Chu CK (2016) A two-stage probit model for predicting recovery rates. J Financ Serv Res 50:311–339
Jarrow R, Lando D, Yu F (2005) Default risk and diversification: theory and empirical implications. Math Financ 15:1–26
Kalotay EA, Altman EI (2017) Intertemporal forecasts of defaulted bond recoveries and portfolio losses. Rev Financ 21:433–463
Keisman D, Van de Castle K (1999) Recovering your money: insights into losses from defaults. Stan Poor’s Credit Week 16:29–34
Kibria BMG, Månsson K, Shukur G (2013) Some ridge regression estimators for the zero-inflated Poisson model. J Appl Stat 40:721–735
Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34:1–14
Lando D, Nielsen MS (2010) Correlation in corporate defaults: contagion or conditional independence? J Financ Intermediation 19:355–372
Li P, Qi M, Zhang X, Zhao X (2016) Further investigation of parametric loss given default modeling. J Credit Risk 12:17–47
Lin TH, Tsai MH (2013) Modeling health survey data with excessive zero and K responses. Stat Med 32:1572–1583
Loterman G, Brown I, Martens D, Mues C, Baesens B (2012) Benchmarking regression algorithms for loss given default modeling. Int J Forecast 28:161–170
Maalouf M, Siddiqi M (2014) Weighted logistic regression for large-scale imbalanced and rare events data. Knowl-Based Syst 59:142–148
Oliveira MR, Louzada F, Pereira GHA, Moreira F, Calabrese R (2015) Inflated mixture models: applications to multimodality in loss given default. http://ssrn.com/abstract=2634919. Accessed on 29 July 2018
Oliveira MR, Moreira F, Louzada F (2017) The zero-inflated promotion cure rate model applied to financial data on time-to-default. Cogent Econ Financ 5:1395950
Ospina R, Ferrari SLP (2010) Inflated beta distributions. Stat Papers 51:111–126
Qi M, Zhao X (2011) Comparison of modeling methods for loss given default. J Bank Financ 35:2842–2855
Resti Y, Ismail N, Jamaan SH (2013) Estimation of claim cost data using zero adjusted gamma and inverse Gaussian regression models. J Math Stat 9:186–192
Rose CE, Martin SW, Wannemuehler KA, Plikaytis BD (2006) On the use of zero-inflated and hurdle models for modeling vaccine adverse event count data. J Biopharm Stat 16:463–481
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Serfling R (1980) Approximation theorems of mathematical statistics. Wiley, New York
Siao JS, Hwang RC, Chu CK (2016) Predicting recovery rates using logistic quantile regression with bounded outcomes. Quan Financ 16:777–792
Sigrist F, Stahel WA (2011) Using the censored gamma distribution for modeling fractional response variables with an application to loss given default. ASTIN Bulletin 41:673–710
Thomas LC, Matuszyk A, Moore A (2012) Comparing debt characteristics and LGD models for different collections policies. Int J Forecast 28:196–203
Tong ENC, Mues C, Thomas L (2013) A zero-adjusted gamma model for mortgage loan loss given default. Int J Forecast 29:548–562
Unal H, Madan D, Güntay L (2003) Pricing the risk of recovery in default with absolute priority rule violation. J Bank Financ 27:1001–1025
Yang S, Harlow LL, Puggioni G, Redding CA (2017) A comparison of different methods of zero-inflated data analysis and an application in health surveys. J Mod App Stat Meth 16:518–543
Yashkir O, Yashkir Y (2013) Loss given default modeling: a comparative analysis. J Risk Model Valid 7:25–59
Acknowledgements
The authors thank the reviewers and the editor for their valuable comments and suggestions that have greatly improved the presentation of this paper. The Ministry of Science and Technology of Taiwan provided support for this research.
Appendices
A sketch of the proof
To decompose \( \ell_{ZCBR}\left({\delta}_{0,1},{\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right), \) we first replace its components \( p_{ZCBR,0}(x_i) \) and \( p_{ZCBR,1}(x_i) \) with \( \delta_0(x_i) \) and \( \eta\{1-\delta_0(x_i)\}[1-\Omega\{(1+\rho)^{-1};\alpha_1(x_i),\beta_1(x_i)\}], \) respectively, and rearrange the result as:
where
Then, through a straightforward calculation, the quantities \( L_1 \) and \( L_2 \) become:
The proof for the decomposition of \( \ell_{ZCBR}\left({\delta}_{0,1},{\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right) \) is complete.
An EM algorithm
To describe the EM algorithm for finding the maximizer of \( \ell_{ZCBR,2}\left({\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right), \) we set the following notation. Let
where \( \rho=\ln\{1+\exp({\rho}_1)\}, \) \( {\theta}_1=\left({\rho}_1,{\alpha}_{1,1},{\beta}_{1,1}\right), \) and \( {\theta}_2=\left({\alpha}_{2,1},{\beta}_{2,1}\right). \) Using these results, the conditional mixture distribution of the positive LGD data is:
where \( y\in(0,1], \) \( \eta=\frac{\exp\left({\eta}_1\right)}{1+\exp\left({\eta}_1\right)}, \) and \( \theta=\left({\theta}_1,{\theta}_2,{\eta}_1\right)=\left({\rho}_1,{\alpha}_{1,1},{\beta}_{1,1},{\alpha}_{2,1},{\beta}_{2,1},{\eta}_1\right). \)
To apply the EM algorithm to maximize \( {\ell}_{ZCBR,2}\left(\theta\right)={\sum}_{i=1,{y}_i>0}^n\ln\left\{{f}_M\left({y}_i|{x}_i,\theta\right)\right\}, \) we introduce a set of dummy latent data \( z_i \) that are indicator variables linking observations \( y_i>0 \) to the mixture components \( f_{M,1}(y_i|x_i,\theta_1) \) and \( f_{M,2}(y_i|x_i,\theta_2). \) Thus, we write the log-likelihood function of the observed and latent data as:
where
The EM algorithm for maximizing \( {\ell}_{ZCBR,2}^{\ast}\left(\theta,{z}_1,\cdots,{z}_n\right) \) comprises the following two-step procedure. The first step of the algorithm computes the conditional expectation:
Here \( {\theta}^{(k)}=\left\{{\theta}_1^{(k)},{\theta}_2^{(k)},{\eta}_1^{(k)}\right\} \) and \( {\tilde{\ell}}_1\left({\theta}_1\right), \) \( {\tilde{\ell}}_2\left({\theta}_2\right), \) and \( {\tilde{\ell}}_3\left({\eta}_1\right) \) are \( \ell_1(\theta_1), \) \( \ell_2(\theta_2), \) and \( \ell_3(\eta_1) \) with \( z_i \) replaced by \( {\tilde{z}}_i. \) The quantity \( {\tilde{z}}_i \) has the formula:
for each \( i=1,\cdots,n, \) where \( {\eta}^{(k)}=\frac{\exp\left\{{\eta}_1^{(k)}\right\}}{1+\exp\left\{{\eta}_1^{(k)}\right\}}. \) The second step of the algorithm involves maximizing \( E\left\{{\ell}_{ZCBR,2}^{\ast}\left(\theta,{z}_1,\cdots,{z}_n\right)|{\theta}^{(k)}\right\} \) to obtain \( {\theta}^{\left(k+1\right)}=\arg{\max}_{\theta}E\left\{{\ell}_{ZCBR,2}^{\ast}\left(\theta,{z}_1,\cdots,{z}_n\right)|{\theta}^{(k)}\right\}. \) It is performed separately by finding \( {\theta}_1^{\left(k+1\right)}=\arg{\max}_{\theta_1}{\tilde{\ell}}_1\left({\theta}_1\right), \) \( {\theta}_2^{\left(k+1\right)}=\arg{\max}_{\theta_2}{\tilde{\ell}}_2\left({\theta}_2\right), \) and \( {\eta}_1^{\left(k+1\right)}=\arg{\max}_{\eta_1}{\tilde{\ell}}_3\left({\eta}_1\right). \) Thus, \( {\theta}^{\left(k+1\right)}=\left\{{\theta}_1^{\left(k+1\right)},{\theta}_2^{\left(k+1\right)},{\eta}_1^{\left(k+1\right)}\right\}. \) The two-step procedure continues until the convergence criterion \( \left\Vert{\theta}^{(k+1)}-{\theta}^{(k)}\right\Vert<0.005 \) is satisfied. The notation \( \left\Vert\psi\right\Vert \) denotes the Euclidean length of the given vector \( \psi. \)
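The E-step and M-step above can be sketched in code. This is a heavily simplified illustration, not the paper's estimator: the two beta components have fixed shapes with no covariates or censoring, so (as an assumption) only the mixing weight corresponding to \( \eta \) has to be updated in the M-step, which then has a closed form.

```python
import numpy as np
from math import lgamma

# Simplified EM sketch for a two-component beta mixture on (0, 1):
# fixed component shapes (an assumption); only the mixing weight eta
# is estimated, mirroring the E-step / M-step structure above.
def log_beta_pdf(y, a, b):
    return (lgamma(a + b) - lgamma(a) - lgamma(b)
            + (a - 1) * np.log(y) + (b - 1) * np.log1p(-y))

rng = np.random.default_rng(3)
n = 4000
z = rng.uniform(size=n) < 0.7                  # true mixing weight 0.7
y = np.where(z, rng.beta(2.0, 8.0, n), rng.beta(8.0, 2.0, n))

a1, b1, a2, b2 = 2.0, 8.0, 8.0, 2.0            # fixed component shapes
eta = 0.5                                      # initial mixing weight
for _ in range(500):
    # E-step: posterior probability z_tilde that y_i is from component 1
    f1 = np.exp(log_beta_pdf(y, a1, b1))
    f2 = np.exp(log_beta_pdf(y, a2, b2))
    z_tilde = eta * f1 / (eta * f1 + (1 - eta) * f2)
    # M-step: closed-form update of the mixing weight
    eta_new = z_tilde.mean()
    # stopping rule ||theta^(k+1) - theta^(k)|| < 0.005 (one-dimensional here)
    if abs(eta_new - eta) < 0.005:
        eta = eta_new
        break
    eta = eta_new

print(eta)  # close to the true mixing weight 0.7
```

In the paper's setting, the M-step additionally refits the beta-regression parameters \( \theta_1 \) and \( \theta_2 \) by numerical optimization, as described in the footnote on the co procedure.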
Cite this article
Hwang, RC., Chu, CK. & Yu, K. Predicting the Loss Given Default Distribution with the Zero-Inflated Censored Beta-Mixture Regression that Allows Probability Masses and Bimodality. J Financ Serv Res 59, 143–172 (2021). https://doi.org/10.1007/s10693-020-00333-w
Keywords
- Conditional distribution
- Expectation–maximization algorithm
- Logistic regression
- Loss given default
- Right-tailed censored beta-mixture regression
- Zero-inflated model