MetaRisk: Semi-supervised few-shot operational risk classification in banking industry
Introduction
In the decade since the global financial crisis, banks and regulators have become increasingly alert to operational risks. However, they still struggle to deal with operational risk effectively [20], [15]. Major banks worldwide have reportedly suffered nearly $210 billion in operational risk losses since 2011. According to the Basel II Accord, operational risk is the risk of loss due to errors, breaches, interruptions or damages caused by people, internal processes, systems or external events. In the banking industry, one of the daily jobs of risk officers is to screen a large number of online news outlets for potential risks and to assess any news event that may expose the bank’s operations to risk. It is therefore of keen interest to financial organizations to develop effective machine learning methods for operational risk classification.
While this task can be easily formulated as a standard document classification problem, there are at least two challenges in designing an effective operational risk prediction system. First, labeling financial news with risk types requires substantial domain knowledge, so it is infeasible to build a dataset using crowd-labeling services. Moreover, in practice, different banks are exposed and potentially vulnerable to different types of operational risks, so no public dataset is available. As a result, only a few news articles are manually labeled by risk officers, while a large number of news articles remain unlabeled. Second, a small portion of financial news is related to multiple risk types, which makes the problem essentially a multi-label classification task. For example, Internal Fraud (such as Bribery and Corruption) and Clients and Market Practice (such as Money Laundering) are two types of operational risks. Bribery and corruption are intrinsically linked with money laundering: not only do they tend to occur together, but the presence of one also tends to reinforce the other. Therefore, a news article that has an Internal Fraud label is likely to carry the Clients and Market Practice label as well. However, such multi-label instances make up only a small portion of the entire dataset, and some label combinations may not appear in the training set at all. As a result, a standard multi-label discriminative classification model often performs suboptimally. It is desirable that the classifier generalize well to those rare multi-label instances and alert risk officers to such “black swan” events.
To tackle the aforementioned practical problems, we re-frame multi-label supervised operational risk classification as a semi-supervised few-shot learning problem. We do so for two reasons. First, semi-supervised learning [6] leverages unlabeled data to learn a better estimate of the data distribution, which in turn helps the discriminative model. Second, few-shot learning [36] is expected to learn a generalizable classifier and thus may accommodate new multi-label classes that are not frequently seen in the training set. These two learning paradigms have largely been studied independently in prior research, with most work addressing one or the other. Recently, a few studies [35], [28] have proposed semi-supervised few-shot learning frameworks for multi-class image recognition, while other researchers have focused on few-shot text classification [37], [14]. However, these methods are not directly applicable to our operational risk context, as we face a multi-label classification task where instances are usually associated with more than one label [50].
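To make the few-shot side of this framing concrete, the following is a minimal sketch of the generic prototypical-network classification rule: each class prototype is the mean of its support embeddings, and a query is scored by a softmax over negative squared distances to the prototypes. It is not MetaRisk itself. The toy 2-D embeddings, the class names used as dictionary keys, and the helper names (`build_prototypes`, `classify`) are illustrative assumptions; in a real system the embeddings would come from a trained document encoder.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query, prototypes):
    """Softmax over negative squared distances to each class prototype."""
    logits = {label: -squared_distance(query, proto)
              for label, proto in prototypes.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - top) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy 2-D embeddings (assumed for illustration only).
support = {
    "internal_fraud": [[1.0, 0.0], [0.8, 0.2]],
    "clients_and_market_practice": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support)
probs = classify([0.9, 0.1], prototypes)
predicted = max(probs, key=probs.get)
```

Because prototypes are computed from whatever support set is supplied at prediction time, a class that was rare or absent during training can still be scored once a handful of labeled examples exist, which is the property the few-shot framing relies on.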
In this work, we propose MetaRisk, a novel multi-label semi-supervised few-shot learning model for operational risk classification. Our method is built on the prototypical network [36] but improves the prototypes of risk combinations (multi-risk classes) by adjusting the weight of each risk type for each instance. Specifically, MetaRisk first uses a weighting scheme to learn prior knowledge about risk class combinations from the relevant individual risk classes. It then builds and refines a prototypical network to learn the single-label and multi-label prototypes. We adopt the attention mechanism [4] to calculate the weighted prototype vector for multi-label risk type combinations. Furthermore, a soft-masking mechanism is introduced to refine the prototypes using unlabeled data, which allows our model to obtain decision boundaries that better fit the underlying risk distribution. We empirically evaluate MetaRisk on a proprietary dataset collected by an international banking organization. Experimental results show that MetaRisk outperforms a set of standard baselines. In particular, it is more effective than the baselines at recognizing new risk type combinations when only a small number of labeled instances and a large number of unlabeled instances are available.
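The two prototype steps described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' formulation: the attention scores are plain dot products between an instance embedding and each single-risk prototype, and the soft mask is a sigmoid of the distance to the prototype with a fixed, hypothetical `threshold` and `slope` (a full model could learn these, and would also assign unlabeled instances softly across several classes). All function names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_combination_prototype(instance, single_prototypes):
    """Build a multi-risk prototype as an attention-weighted sum.

    Assumption: dot-product attention between the instance embedding and
    each single-risk prototype in the combination.
    """
    scores = [sum(x * p for x, p in zip(instance, proto))
              for proto in single_prototypes]
    weights = softmax(scores)
    dim = len(instance)
    prototype = [sum(w * proto[i] for w, proto in zip(weights, single_prototypes))
                 for i in range(dim)]
    return prototype, weights


def soft_mask_refine(prototype, n_labeled, unlabeled, threshold, slope):
    """Refine a prototype with unlabeled embeddings, down-weighting far ones.

    Assumption: mask = sigmoid(-slope * (distance - threshold)), with
    hypothetical fixed threshold and slope values.
    """
    weighted_sum = [n_labeled * x for x in prototype]  # sum of labeled points
    total = float(n_labeled)
    for u in unlabeled:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, prototype)))
        z = min(slope * (dist - threshold), 700.0)  # clamp to avoid overflow
        mask = 1.0 / (1.0 + math.exp(z))
        weighted_sum = [s + mask * x for s, x in zip(weighted_sum, u)]
        total += mask
    return [s / total for s in weighted_sum]


# Toy example: two single-risk prototypes and one instance embedding.
fraud = [1.0, 0.0]
market = [0.0, 1.0]
combo, weights = weighted_combination_prototype([2.0, 1.0], [fraud, market])

# One nearby unlabeled point and one distant outlier.
refined = soft_mask_refine(combo, n_labeled=1,
                           unlabeled=[[0.7, 0.3], [5.0, 5.0]],
                           threshold=1.0, slope=10.0)
```

In this toy run the nearby unlabeled point pulls the prototype toward it, while the distant outlier receives a mask close to zero and has almost no effect, which is the intended behaviour of a soft mask.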
Our main contributions are twofold.
- First, to the best of our knowledge, we are the first to study the operational risk classification problem using a semi-supervised meta-learning method. We identify two practical challenges associated with operational risk classification and frame the problem using a semi-supervised few-shot learning framework with a weighting scheme. We further modify the framework so that it can generalize to minority multi-label risk classes.
- Second, we evaluate the framework on a real-world dataset and demonstrate its effectiveness. The system prototype has been used internally by the bank’s risk management team. We hope this work provides key insights into designing practical semi-supervised meta-learning models for important financial applications.
The rest of the paper is arranged as follows. Section 2 reviews related literature and positions our work in that context. Section 3 introduces the formal problem definition, together with the necessary background on operational risk classification and meta-learning techniques. Section 4 presents the details of our MetaRisk model. Section 5 reports comprehensive experimental results demonstrating the superiority of our model. Section 6 concludes this work and points out future directions.
Section snippets
Related work
We now review the relevant literature from three basic perspectives and position our work in that context.
Preliminaries
In this section, we present the dataset and formally define the problem. We also describe the semi-supervised setting and the meta-learning paradigm, as well as the necessary background on the operational risk classification problem. Table 1 summarizes the notation frequently used in this paper.
Main methodologies
In this section, we present the few-shot risk prediction framework, MetaRisk. The overall architecture of our proposed MetaRisk is shown in Fig. 1. The high-level workflow is as follows.
We first construct the support sets and query sets using a modified episode (task) paradigm and turn the learning task into few-shot learning. All training financial articles are then encoded into an embedding space by using standard Bi-LSTM and self-attention mechanisms as our document encoding component. We
Experimental observations
In this section, we evaluate our proposed methods on a real-world dataset. We start by covering baselines, followed by results and discussions.
Conclusions and future work
Financial Technology (FinTech) is transforming the financial service industry by providing new services, controlling costs and supporting profitability. In the banking industry, using big data analytics and machine learning to identify potential operational risks has attracted executives and managers’ attention from a practical perspective. Due to the nature of the financial service industry, obtaining high-quality labeled data is usually costly. Moreover, it is desirable that the intelligent
CRediT authorship contribution statement
Fan Zhou: Conceptualization, Methodology, Data curation, Writing - original draft. Xiuxiu Qi: Software, Validation, Investigation. Chunjing Xiao: Conceptualization, Methodology, Resources. Jiahao Wang: Resources, Visualization, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by National Natural Science Foundation of China (Grant Nos. 62072077, 61602097 and 61402151).
References (50)
- et al., Multi-label semi-supervised classification through optimum-path forest, Information Sciences (2018)
- et al., Learning multi-label scene classification, Pattern Recognition (2004)
- et al., Automatic detection of relationships between banking operations using machine learning, Information Sciences (2019)
- Bankruptcy prediction using imaged financial ratios and convolutional neural networks, Expert Systems with Applications (2019)
- et al., The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature, Decision Support Systems (2011)
- et al., Leveraging deep learning with LDA-based text analytics to detect automobile insurance fraud, Decision Support Systems (2018)
- et al., ML-KNN: A lazy learning approach to multi-label learning, Pattern Recognition (2007)
- A. Adhikari, A. Ram, R. Tang, J. Lin, Rethinking complex neural network architectures for document classification, in:...
- A. Ayyad, N. Navab, M. Elhoseiny, S. Albarqouni, Semi-supervised few-shot learning with local and global consistency,...
- D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, in:...
- Semi-supervised learning, IEEE Transactions on Neural Networks
- Collaboration based multi-label learning
- Induction networks for few-shot text classification
- Long short-term memory, Neural Computation
- Managing Operational Risk: 20 Firmwide Best Practice Strategies