Abstract
Credit scoring is one of the key problems in financial risk management. This paper studies the credit scoring problem using the set-valued identification method, which is used to explain the relation between individual attribute vectors and the classification of borrowers as creditworthy or not creditworthy. In particular, system parameters are estimated by the set-valued identification algorithm based on a given recognition criterion. To illustrate the efficiency of the proposed method, practical experiments are conducted on credit card applicants from Australia and credit card holders from Taiwan, respectively. The empirical results show that the set-valued model has a higher prediction accuracy than the logistic regression model on both small and large data sets. Furthermore, the parameters estimated by the set-valued identification method are more stable, which provides a meaningful and logical explanation for extracting the factors that influence borrowers' credit scores.
Acknowledgment
We thank the anonymous researcher and I-Cheng Yeh of the Department of Information Management, Chung Hua University, Taiwan, and the Department of Civil Engineering, Tamkang University, Taiwan, for sharing the data in the UCI Machine Learning Repository. We are also grateful to Natalino Busa, Chief Data Officer of Teko Ventures, for the tutorial on data analysis and credit scoring prediction.
Additional information
This research was supported by the National Key R&D Program of China under Grant No. 2018YFA0703800, the National Natural Science Foundation of China under Grant No. 61622309, and the Verg Foundation (Sweden).
This paper was recommended for publication by Editor LIU Yungang.
Cite this article
Wang, X., Hu, M., Zhao, Y. et al. Credit Scoring Based on the Set-Valued Identification Method. J Syst Sci Complex 33, 1297–1309 (2020). https://doi.org/10.1007/s11424-020-9101-4
DOI: https://doi.org/10.1007/s11424-020-9101-4