Does soft information determine credit risk? Text-based evidence from European banks

https://doi.org/10.1016/j.intfin.2021.101303

Highlights

  • We use supervised machine learning to examine whether soft information determines credit risk.

  • We assess how far both bank- and country-level characteristics influence variations in credit risks.

  • Text-based credit risk (soft) measure explains the variation in credit risk measures.

  • Bank-level and country-level characteristics explain variations in credit risk measures.

Abstract

This paper uses a supervised machine learning algorithm to extract relevant (soft) information from annual reports and examines whether such information determines credit risk (as measured by non-performing loans, Ohlson’s O-score, Altman’s Z-score, and credit rating downgrades). The paper also assesses how far both bank- and country-level characteristics influence variations in credit risks both within and between banks across 19 European countries between 2005 and 2017. Based on 1885 firm-year observations, we find that the text-based credit risk (soft) measure explains a substantial portion of the variation in NPLs, O-score, Z-score, and credit rating downgrades. We also find that bank-level characteristics and country-level characteristics are highly important for explaining variations in non-performing loans, O-score, and credit rating downgrades, as compared to Z-score. Overall, our results have implications for firms, regulators, and market participants who are seeking evidence on the credibility of annual reports in conveying relevant information that reflects actual credit risk.

Introduction

Current evidence (Bonsall et al., 2017, Donovan et al., 2019) suggests that credit risk presents a signal to market participants by which they may assess the competitive position of a firm and its management capabilities. The European Banking Federation reports that credit risk, as measured by non-performing loans (NPLs), is a problem of the past, because European Union (EU) banks have reported large declines in NPLs over the years. For instance, in 2017 the NPL ratio of the EU (3.7%) fell below the world average of 3.74%. Prior research focuses largely on quantitative rather than qualitative (soft) information for estimating credit risk measures, using numbers from financial statements or historical stock price movements (Ali and Daly, 2010, Altman, 1968, Cucinelli et al., 2018, Ghosh, 2015, Ohlson, 1980). These credit risk measures are numerous and include multiple discriminant models, probability of default, logit models, and credit ratings. The heavy reliance on quantitative information for estimating credit risk over the years is perhaps due to the difficulty of capturing and quantifying qualitative (soft) information from corporate disclosures.
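As an illustration of the quantitative, ratio-based measures mentioned above, the following minimal sketch computes the textbook form of Altman's (1968) Z-score. It assumes the widely quoted coefficients and cut-offs for listed manufacturing firms; this excerpt does not state which Z-score variant the paper applies to banks, and the `Financials` fields and all input values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """Hypothetical statement inputs, all in the same currency unit."""
    working_capital: float
    retained_earnings: float
    ebit: float
    market_value_equity: float
    sales: float
    total_assets: float
    total_liabilities: float


def altman_z(f: Financials) -> float:
    """Textbook Altman (1968) Z-score; assumed variant, not the paper's."""
    x1 = f.working_capital / f.total_assets
    x2 = f.retained_earnings / f.total_assets
    x3 = f.ebit / f.total_assets
    x4 = f.market_value_equity / f.total_liabilities
    x5 = f.sales / f.total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def zone(z: float) -> str:
    """Conventional cut-offs: below 1.81 is distress, above 2.99 is safe."""
    if z < 1.81:
        return "distress"
    if z > 2.99:
        return "safe"
    return "grey"


# Invented example firm, used only to exercise the functions.
firm = Financials(working_capital=20, retained_earnings=30, ebit=10,
                  market_value_equity=80, sales=100,
                  total_assets=100, total_liabilities=40)
z = altman_z(firm)
print(round(z, 2), zone(z))
```

Every input here is a number taken from financial statements or market prices, which is the reliance on hard information that the paragraph above describes.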

Together, prior research on textual analysis provides reliable measures for several features, such as readability (e.g., Li, 2008, Loughran and McDonald, 2014), tone (e.g., Feldman et al., 2010, Loughran and McDonald, 2011, Price et al., 2012), forward-looking content (e.g., Li, 2010, Muslu et al., 2015), and firm risk (e.g., Campbell et al., 2014, Elshandidy et al., 2013, Elshandidy et al., 2015). Another strand of the literature applies supervised machine learning algorithms to quantify narrative sections of corporate reports (conference calls) that determine the credit risk (e.g., Donovan et al., 2019) and earnings surprise (e.g., Frankel et al., 2018) of firms. Hitherto, very little research outside the US context has combined quantitative with qualitative information in banks.

Our paper addresses this gap by employing supervised machine learning algorithms to model language in the narrative sections of annual reports of a large-scale sample of European banks in order to explain the observed variations in these banks’ credit risks. Our paper complements that of Donovan et al. (2019), which focuses on conference calls from the US context and employs credit default swap spread as the credit risk proxy. They conclude that their text-based credit-risk measure explains a substantial portion of borrowers’ credit risk. Our paper, however, differs significantly from Donovan et al. (2019) in terms of the approach and measures. For example, our paper employs data from the EU context, which, for institutional reasons, yields results different from those obtained under the US regulatory setup linked to the Securities and Exchange Commission (SEC). Our paper also employs NPLs to measure credit risk because they have been widely used as a measure of credit risk (Zhang et al., 2016) and are linked to bank failure, acting as a forerunner of bank crises (Ghosh, 2015). Furthermore, our paper models the variations in credit risk that can be explained by our text-based credit risk measure at two different levels, namely, the bank and country levels.
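To make the idea of a supervised, text-based credit risk measure concrete, the sketch below uses a deliberately simplified stand-in rather than the algorithm used in the paper: each word receives a weight equal to the covariance sum between its relative frequency and the NPL ratio in a labelled training set, and a new document is scored by the weighted sum of its word frequencies. The report snippets, NPL values, and function names are all invented for illustration.

```python
from collections import Counter


def rel_freq(text):
    """Relative frequency of each whitespace-separated, lower-cased word."""
    words = text.lower().split()
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}


def fit_weights(docs, labels):
    """Weight each word by the covariance sum of its frequency with the label."""
    freqs = [rel_freq(d) for d in docs]
    vocab = sorted({w for f in freqs for w in f})
    y_bar = sum(labels) / len(labels)
    weights = {}
    for w in vocab:
        col = [f.get(w, 0.0) for f in freqs]
        x_bar = sum(col) / len(col)
        weights[w] = sum((x - x_bar) * (y - y_bar)
                         for x, y in zip(col, labels))
    return weights


def score(text, weights):
    """Weighted sum of word frequencies; unseen words contribute zero."""
    return sum(weights.get(w, 0.0) * f for w, f in rel_freq(text).items())


# Invented training snippets paired with invented NPL ratios (in percent).
train_docs = [
    "loan impairment default losses rising",
    "default impairment provisions increased",
    "strong capital growth profit stable",
    "profit growth dividend stable capital",
]
train_npl = [9.0, 8.0, 2.0, 1.5]

weights = fit_weights(train_docs, train_npl)
risky = score("impairment default rising", weights)
benign = score("profit growth stable", weights)
print(risky > benign)
```

The supervision comes from the labels: words that co-occur with high NPL ratios receive positive weights, so the score turns narrative text into a number that can enter a regression alongside quantitative variables.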

Therefore, we manually collected the annual reports of banks from 19 European countries for the period 2005–2017 to explore the usefulness of credit-risk information (as shown in Section 4). Our results, based on 1885 firm-year observations, show that the text-based credit risk measure is significant and exhibits a positive association with NPLs, Ohlson’s O-score, Altman’s Z-score, and credit rating downgrades. This suggests that a significant share of the variation in all credit risk measures, within and between banks across countries over the chosen period, can be explained by our proposed measure of soft information for credit risk. At country level, we find that financial stress over time has significantly higher explanatory power for variations in all credit risk measures. We also find that corruption level explains significant variations in non-performing loans and credit rating downgrades. These results suggest that our text-based credit risk measure contains an economically significant approximation of actual credit risk, and, more importantly, it explains credit risk better than existing credit risk measures. Overall, our evidence supports the argument that banks’ annual reports contain information which relates to credit risk and reflects a bank’s risk exposure. Our findings are robust to a variety of sensitivity tests and alternative credit risk measures.

Our paper contributes to the finance and accounting literature in the following ways. First, we employ corporate disclosure from the EU context. Prior research (e.g., Donovan et al., 2019, Frankel et al., 2018) on the application of machine learning algorithms focuses largely on specific corporate disclosures from the US context (e.g., 10-Ks, MD&A, analyst reports, and conference calls). To our knowledge, our paper is the first to apply supervised machine learning algorithms to provide direct evidence on the value relevance of annual reports from the EU context. Second, our paper provides guidance on determining credit risk by finding ways to map NPLs to the narrative sections of banks’ annual reports with the application of machine learning algorithms. Our paper is the first to employ a machine learning algorithm which converts the soft content of corporate disclosures into hard content capable of explaining credit risk in a large-scale sample of EU banks. The previous literature focuses principally on quantitative measures to estimate credit risk (e.g., Ali and Daly, 2010, Cucinelli et al., 2018, Ghosh, 2015). However, to our knowledge, no studies have so far mapped language in annual reports to the NPLs of European banks.

Third, unlike earlier literature, we observe variations in credit risks over time, and then associate such variations with changes in the text-based credit risk measure. Our paper models the (soft) language in banks’ annual reports over a thirteen-year period to identify credit-risk relevant information more robustly than prior research, since we use Repeated Measures Multilevel Analysis (RMMA) to mitigate the problems caused by nested effects and to account for the variances at various levels over time that are normally ignored in traditional Ordinary Least Squares (OLS) regression (Robinson, 2003). Finally, the text-based (soft) credit risk measure explains an economically significant portion of the variations in NPLs, and more importantly, it indicates other credit risk measures. Given our results, we suggest that annual reports are useful for identifying the credit risk exposure of firms, thus complementing the work of prior writers (e.g., Campbell et al., 2014, Donovan et al., 2019). Our evidence, as a result, confirms the credibility of the narrative sections of corporate reports, which strengthens the reputation of existing credit risk measures.
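The nesting problem that motivates a multilevel approach can be illustrated with the intraclass correlation (ICC), the share of total variance that lies between banks rather than within them over time. The sketch below uses the one-way ANOVA estimator for a balanced panel; it is not the RMMA model itself, and the bank names and NPL values are invented.

```python
from __future__ import annotations

from statistics import mean


def icc_oneway(groups: dict[str, list[float]]) -> float:
    """One-way ANOVA estimate of the ICC for a balanced panel."""
    sizes = {len(v) for v in groups.values()}
    if len(sizes) != 1:
        raise ValueError("every group needs the same number of observations")
    k = sizes.pop()   # observations (years) per bank
    n = len(groups)   # number of banks
    if n < 2 or k < 2:
        raise ValueError("need at least two groups and two observations each")

    grand = mean([x for v in groups.values() for x in v])
    group_means = {g: mean(v) for g, v in groups.items()}

    # Between-bank mean square: spread of bank means around the grand mean.
    msb = k * sum((m - grand) ** 2 for m in group_means.values()) / (n - 1)
    # Within-bank mean square: spread of yearly values around each bank mean.
    msw = sum((x - group_means[g]) ** 2
              for g, v in groups.items() for x in v) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Invented NPL ratios (percent) for two hypothetical banks over two years.
panel = {"bank_a": [4.0, 6.0], "bank_b": [9.0, 11.0]}
icc = icc_oneway(panel)
print(round(icc, 3))
```

An ICC well above zero means that repeated observations of the same bank are far from independent, which is the condition under which pooled OLS standard errors become unreliable and a multilevel specification is preferable.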

Our paper has important theoretical implications. First, our paper shows the importance of modelling interactions between levels in cross-country studies where data is nested in hierarchical structures. This answers Elshandidy et al.'s (2018a) call for more research papers to cover financial firms in a cross-country setting utilizing computer-based assistance in capturing risk information. It further accommodates their suggestion that future research in cross-country settings should adopt multilevel techniques that can capture the hierarchical structure of cross-country data. Our evidence shows that, in cross-country banking research, it is essential to model interactions between country- and bank-level variables to understand the variations in credit risk, and that further research is needed to develop theoretical frameworks that better explain interactions between these different levels. Furthermore, our results suggest that more attention should be given to variables that may explain the variations in credit risk for banks over time. Our suggestion is consistent with the recent trend (Barth et al., 2013, Beck et al., 2003, Chan and Mohd, 2016) in the literature which explores widely varying governance indicators on bank characteristics (e.g., efficiency, loan quality). Finally, other related research on textual analysis can use our proposed method in quantifying the narrative sections of annual reports (or any similar corporate outlets) with regard to other attributes (e.g., operational risk disclosure). Specifically, our method adds significantly to machine learning algorithms in the financial literature by endorsing the current importance of widening this research scope to give more attention to the application of such algorithms in exploring various firm fundamentals. This is of particular importance in view of the increased concern regarding whether accounting disclosure benefits market participants or not.

Our paper has important practical implications for managers, regulators, and investors by contributing to the ongoing discussion on whether banks’ annual reports have informative content. Managers who know that supervised machine learning algorithms can be applied to their corporate disclosures to provide deeper insights may position themselves effectively to disclose more relevant information, thereby addressing the issue of information asymmetry. Regulators may encourage banking institutions to continue to disclose vital credit risk information. This becomes increasingly important, considering the great reliance of global economies on understanding the complexity and grey areas in the published reports of large firms. Investors are likely to reshape their trading strategies and, if they can, employ supervised algorithms to obtain more precise information about firms. In theoretical terms, the high significance of corruption level and financial stress in explaining variations in credit risk across countries implies that any attempt to identify the factors that drive credit risk should take these two country factors into consideration.

The paper proceeds as follows. The next section discusses the institutional background in Europe. Section 3 reviews relevant literature and develops research hypotheses. Section 4 describes the sample selection, data collection, and empirical models. Section 5 presents the empirical results. Section 6 introduces further analysis and robustness checks of our results. Finally, Section 7 draws conclusions and suggests avenues for future research.

Section snippets

Institutional background

The European Banking Industry (EBI), over the years, has undergone rigorous rounds of reform through forces such as European integration, technology, and deregulation, to improve the soundness of banking practices and mitigate overall risk exposure (Goddard et al., 2007). Regarding risk, in the records of the EBI, credit risk continues to be ranked as the main risk for most banks. In order to enhance the stability and overall credit risk exposure of the industry, the EU over the years has

Relevant literature on banks’ credit risk and textual credit risk

In their recent review, Elshandidy et al. (2018a) survey 32 papers on risk disclosure that are synthesised based on their primary focus into those papers (16) interested in incentives for risk disclosure (what motivates firms to reveal risk information), papers (12) interested in the informativeness of risk information (whether risk information is value relevant or not), and papers (4) that are interested in studying both incentives and informativeness. After that they identify four areas of

Sample selection and data sources

The criteria and process for our sample selection are described in Table 1 (Panel A). The sample covers the thirteen-year period from 2005 to 2017. We used 2005 as the starting point to take account of the mandatory adoption of IFRS in Europe to ensure the comparability of accounting standards across countries. We obtained annual reports from Thomson One or, when this was not available, from a bank’s website. We focused on annual reports because they remain a major source of information for

Descriptive statistics

Table 2 reports the summary statistics of the variables employed in our analysis. We observe high sample variability in NPLs and bank size. In our sample, the NPL ratio has a mean and standard deviation of 5.72% and 4.78%, respectively. A closer look at specific countries from our sample reveals that the highest mean NPL ratio is in Greece (7.71%) and the lowest is in Ireland (0.67%). We observe from Appendix B that the trend in NPLs was lower and stable before the outbreak of

Alternative econometric model (predictive power of NPL_TXT using OLS regression)

We employed traditional OLS regression using our full sample to analyse the impact of NPL_TXT on the credit risk variables without nested effects, as specified in Equation (3). The results are reported in Table 9. Serial correlation was addressed by clustering standard errors at bank level.

$$CR_{ik} = \beta_{0ik} + \beta_{1} X_{bl,ik} + \beta_{2} X_{cl,k} + \varepsilon_{ik} \tag{3}$$

$CR_{ik}$ represents the credit risk measures of bank $i$ in country $k$. Credit risk measures were the same as employed above. $\beta_{0ik}$ is the intercept, $X_{bl,ik}$ represents bank-level variables
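A regression of the form in Equation (3) can be sketched in a few lines. The example below is a minimal illustration using only the standard library: it fits the intercept, one bank-level regressor, and one country-level regressor by solving the normal equations. It returns point estimates only and does not implement the bank-level clustering of standard errors; the data and the "true" coefficients are invented so that the fit can be checked.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Invented data: x_bl varies by bank, x_cl is constant within each country.
x_bl = [1.0, 2.0, 3.0, 1.5, 2.5, 4.0]
x_cl = [0.2, 0.2, 0.2, 0.8, 0.8, 0.8]
# Credit risk generated without noise: CR = 1.0 + 0.5 * x_bl + 2.0 * x_cl.
cr = [1.0 + 0.5 * b + 2.0 * c for b, c in zip(x_bl, x_cl)]

design = [[1.0, b, c] for b, c in zip(x_bl, x_cl)]
beta0, beta1, beta2 = ols(design, cr)
print(round(beta0, 6), round(beta1, 6), round(beta2, 6))
```

Because the country-level regressor takes the same value for every bank in a country, it is identified only by differences between countries, which is one reason the treatment of nested effects matters for this kind of specification.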

Conclusion

We employ sLDA to extract relevant (soft) credit risk information from annual reports. We associate variations in credit risk as measured by NPL, OSCORE, ZSCORE, and DOWN with variations in both bank-level characteristics and country level characteristics across 19 EU countries, over the period from 2005 to 2017. We find that our text-based credit risk measure explains a substantial portion of the variation in all credit risk measures. This finding holds after several further analyses. In terms

CRediT authorship contribution statement

Albert Acheampong: Data curation, Writing - original draft, Methodology, Software. Tamer Elshandidy: Conceptualization, Methodology, Software, Validation, Writing - review & editing.

References (106)

  • Z. Bozanic et al.

    Management earnings forecasts and other forward-looking statements

    J. Account. Econ.

    (2018)
  • F. Butaru et al.

    Risk and risk management in the credit card industry

    J. Bank. Financ.

    (2016)
  • R. Cardarelli et al.

    Financial stress and economic contractions

    J. Financ. Stab.

    (2011)
  • M. Chen et al.

    Corruption and bank risk-taking: evidence from emerging economies

    Emerg. Mark. Rev.

    (2015)
  • D. Cucinelli et al.

    Credit risk in European banks: the bright side of the internal ratings based approach

    J. Bank. Financ.

    (2018)
  • S.R. Das et al.

    Accounting-based versus market-based cross-sectional models of CDS spreads

    J. Bank. Financ.

    (2009)
  • T. Elshandidy et al.

    What drives mandatory and voluntary risk reporting variations across Germany, UK and US?

    Br. Account. Rev.

    (2015)
  • T. Elshandidy et al.

    Aggregated, voluntary, and mandatory risk disclosure incentives: evidence from UK FTSE all-share companies

    Int. Rev. Financ. Anal.

    (2013)
  • T. Elshandidy et al.

    Environmental incentives for and usefulness of textual risk reporting: evidence from Germany

    Int. J. Account.

    (2016)
  • T. Elshandidy et al.

    Risk reporting: a review of the literature and implications for future research

    J. Account. Lit.

    (2018)
  • F. Fiordelisi et al.

    Efficiency and risk in European banking

    J. Bank. Financ.

    (2011)
  • T. García-Marco et al.

    Risk-taking behaviour and ownership in the banking industry: the Spanish evidence

    J. Econ. Bus.

    (2008)
  • A. Ghosh

    Banking-industry specific and regional economic determinants of non-performing loans: Evidence from US states

    J. Financ. Stab.

    (2015)
  • J. Goddard et al.

    European banking: an overview

    J. Bank. Financ.

    (2007)
  • R. Jankowitsch et al.

    Modelling the economic value of credit rating systems

    J. Bank. Financ.

    (2007)
  • S. Kalemli-Ozcan et al.

    Leverage across firms, banks, and countries

    J. Int. Econ.

    (2012)
  • F. Li

    Annual report readability, current earnings, and earnings persistence

    J. Account. Econ.

    (2008)
  • J. Li et al.

    Reactions of Japanese markets to changes in credit ratings by global and local agencies

    J. Bank. Financ.

    (2006)
  • D.P. Louzis et al.

    Macroeconomic and bank-specific determinants of non-performing loans in Greece: a comparative study of mortgage, business and consumer loan portfolios

    J. Bank. Financ.

    (2012)
  • G. Marcato et al.

    Market integration, country institutions and IPO underpricing

    J. Corp. Financ.

    (2018)
  • A. Miihkinen

    The usefulness of firm risk disclosures under different firm riskiness, investor-interest, and market conditions: new evidence from Finland

    Adv. Account.

    (2013)
  • M. Nguyen et al.

    Market power, revenue diversification and bank stability: evidence from selected South Asian countries

    J. Int. Financ. Mark. Inst. Money.

    (2012)
  • J. Park

    Corruption, soundness of the banking sector, and economic growth: a cross-country study

    J. Int. Money Financ.

    (2012)
  • R. Price et al.

    The impact of governance reform on performance and transparency

    J. Financ. Econ.

    (2011)
  • S.M.K. Price et al.

    Earnings conference calls and stock returns: the incremental informativeness of textual tone

    J. Bank. Financ.

    (2012)
  • M.R. Roberts et al.

    Endogeneity in Empirical Corporate Finance

    Handb. Econ. Finance.

    (2013)
  • A.F. Adam-Müller et al.

    Risk disclosure noncompliance

    J. Account. Pub. Pol.

    (2020)
  • M. Abbdullah et al.

    Risk management disclosure: a study on the effect of voluntary risk management disclosure toward firm value

    J. Appl. Account. Res.

    (2015)
  • M. Agostino et al.

    The value relevance of IFRS in the European banking industry

    Rev. Quant. Financ. Account.

    (2011)
  • L.S. Aiken et al.

    Multiple Regression: Testing and Interpreting Interactions

    (1991)
  • A. Allini et al.

    The board’s role in risk disclosure: an exploratory study of Italian listed state-owned enterprises

    Public Money Manag.

    (2016)
  • E.I. Altman

    Financial ratios, discriminant analysis and the prediction of corporate bankruptcy

    J. Finance

    (1968)
  • Altman, E.I., Hartzell, J., Peck, M., 1998. Emerging market corporate bonds: a scoring system. pp....
  • S. Aziz et al.

    Machine learning in finance: a topic modeling approach

    SSRN Electron. J.

    (2019)
  • R. Balakrishnan et al.

    The transmission of financial stress from advanced to emerging economies

    Emerg. Markets Finance Trade.

    (2011)
  • Y. Bao et al.

    Simultaneously discovering and quantifying risk types from textual risk disclosures

    Manage. Sci.

    (2014)
  • R.M. Baron et al.

    The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations

    J. Pers. Soc. Psychol.

    (1986)
  • BCBS

    Revised Pillar 3 Disclosure Requirements

    (2015)
  • P.D. Bliese et al.

    Being both too liberal and too conservative: The perils of treating grouped data as though they were independent

    Organ. Res. Methods.

    (2004)
  • D. Blei et al.

    A topic model for word sense disambiguation, in

Cited by (24)

    • The value of family social capital in informal financial markets: Evidence from China

      2023, Pacific Basin Finance Journal
      Citation Excerpt:

      The main functions of financial institutions are risk assessment and control (Garmaise, 2015). Identifying factors or cues that can alleviate information asymmetry between counterparties has received special attention from academia (e.g., Petersen and Rajan, 2002; Norden and Weber, 2010; Garmaise, 2015; Fisman et al., 2017; Acheampong and Elshandidy, 2021). According to previous studies, social capital has been documented to have significant effects on corporate financing activities, such as reducing the cost of bank debt (Engelberg et al., 2012; Hasan et al., 2017), facilitating access to trade credit (Liu et al., 2016), reducing the cost of equity (Ferris et al., 2017a), influencing capital structure (Huang and Shang, 2019; Dudley, 2021), and increasing corporate financing efficiency (Yin et al., 2022).


    *We thank Jonathan A. Batten (the Editor) and the anonymous referee for constructive and helpful comments. This paper has benefited from comments and suggestions from participants at Bradford University Management School Conference (June 2019). We thank Mahdi Mousavi, Chengang Wang, and Steven Wu for their helpful suggestions.
