Inefficiency source tracking: evidence from data envelopment analysis and random forests

  • S.I.: Regression Methods based on OR techniques
Annals of Operations Research

Abstract

Banks operate in increasingly complex and dynamic environments, which in turn affect their relative efficiency. Traditional data envelopment analysis (DEA) models are widely used to measure efficiency. However, environmental (exogenous) variables can significantly influence DEA efficiency scores, so identifying the most important environmental variables is crucial when evaluating bank performance. This study introduces a three-stage DEA framework that employs a random forest, a powerful ensemble method, for variable selection to identify the most influential environmental variables. In the third stage, a regression analysis investigates the direction of influence of the selected environmental variables and their power to predict bank performance. The proposed framework is tested on a sample of 110 banks in Middle East and North Africa countries, observed over 3 years (2014 to 2016). Accordingly, a relevant set of environmental variables is identified and its effects on bank efficiency are studied. The findings indicate that the country in which a bank operates has a significant effect on the bank's efficiency. The results also show that the overall average efficiency score is stable (around 87%) for all banks. The study concludes with its limitations and suggested directions for further research.
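As a rough illustration of the first and third stages, the sketch below uses a deliberately simplified setting: one input, one output, and constant returns to scale. In that special case a DEA efficiency score reduces to each unit's output/input ratio divided by the best ratio in the sample. The third stage is reduced to a one-variable least-squares slope, whose sign gives the direction of influence. The random forest selection stage is omitted, and all names and figures (`staff_costs`, `loans`, `gdp_growth`) are hypothetical and not taken from the study.

```python
from statistics import mean


def crs_efficiency(inputs, outputs):
    """Single-input, single-output DEA score under constant returns to scale.

    In this special case each unit's score is its output/input ratio
    divided by the best ratio observed in the sample.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data for four banks (illustrative only, not from the study).
staff_costs = [10.0, 20.0, 30.0, 40.0]  # input
loans = [20.0, 30.0, 60.0, 40.0]        # output
gdp_growth = [3.0, 2.0, 3.0, 1.0]       # assumed environmental variable

# Stage 1: efficiency scores in (0, 1], where 1.0 marks a frontier bank.
scores = crs_efficiency(staff_costs, loans)

# Stage 3 (simplified): the sign of the slope indicates whether the
# environmental variable is associated with higher or lower efficiency.
slope = ols_slope(gdp_growth, scores)
direction = "positive" if slope > 0 else "negative"
```

With several inputs and outputs the scores would instead come from solving one linear program per bank, and the variable selection step would rank many candidate environmental variables before any regression is fitted.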

Notes

  1. Egypt, 19 banks; Lebanon, 18 banks; United Arab Emirates, 17 banks; Saudi Arabia, 11 banks; Jordan, 10 banks; Oman and Bahrain, 7 banks each; Qatar and Tunisia, 6 banks each; Iran, 5 banks; and Kuwait, 4 banks.

  2. The excluded environmental variables are:

    Price-earnings index (P/E): helps in evaluating the attractiveness of an investment. It is calculated as the last closing price divided by the latest trailing four-quarter earnings per share.

    Price book value: the ratio of the last closing price to the latest book value; it indicates growth prospects.

    Beta: a relative measure of the systematic return of the stock with respect to the overall market. Stocks with betas greater than 1.0 are more volatile than the market and are positively correlated with it; such stocks are termed aggressive securities.

    Capital structure (E/D): refers to the way a bank finances itself through some combination of equity and debt.

    Market share: shows the extent of a bank's risk, as higher ratios of loans to total assets reveal aggressive lending by the bank to increase profits.
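The ratio definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' computation: the trailing four-quarter earnings per share is taken as the sum of the last four quarterly figures, book value is assumed to be per share, and beta uses the standard covariance-over-variance definition, which the note itself does not spell out. All input numbers are hypothetical.

```python
def price_earnings(last_close, quarterly_eps):
    """P/E: last closing price over trailing four-quarter earnings per share."""
    trailing_eps = sum(quarterly_eps[-4:])  # assumed: sum of last four quarters
    return last_close / trailing_eps


def price_to_book(last_close, book_value_per_share):
    """Price book value: last closing price over latest book value per share."""
    return last_close / book_value_per_share


def beta(stock_returns, market_returns):
    """Beta as covariance(stock, market) / variance(market)."""
    n = len(market_returns)
    m_bar = sum(market_returns) / n
    s_bar = sum(stock_returns) / n
    cov = sum((s - s_bar) * (m - m_bar)
              for s, m in zip(stock_returns, market_returns))
    var = sum((m - m_bar) ** 2 for m in market_returns)
    return cov / var


# Hypothetical figures for a single bank stock.
last_close = 24.0
quarterly_eps = [0.5, 0.5, 0.25, 0.75]
pe = price_earnings(last_close, quarterly_eps)
pb = price_to_book(last_close, 16.0)

# A stock that always moves 1.5 times the market has a beta of about 1.5,
# which the note would class as an aggressive security.
market = [0.01, -0.02, 0.03, 0.02]
stock = [1.5 * r for r in market]
b = beta(stock, market)
```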

Author information

Corresponding author

Correspondence to Abdel Latef Anouze.

About this article

Cite this article

Anouze, A.L., Bou-Hamad, I. Inefficiency source tracking: evidence from data envelopment analysis and random forests. Ann Oper Res 306, 273–293 (2021). https://doi.org/10.1007/s10479-020-03883-3

  • DOI: https://doi.org/10.1007/s10479-020-03883-3
