A hybrid method with dynamic weighted entropy for handling the problem of class imbalance with overlap in credit card fraud detection, Expert Systems with Applications


Expert Systems with Applications (IF 7.5), Pub Date: 2021-02-25, DOI: 10.1016/j.eswa.2021.114750
Zhenchuan Li, Mian Huang, Guanjun Liu, Changjun Jiang

Class imbalance with overlap is a very challenging problem in electronic fraud transaction detection. Fraudsters go to great lengths to make a fraudulent transaction look as similar as possible to a genuine one in order to avoid detection. As a result, many fraudulent transactions overlap with genuine transactions in feature space and are hard to distinguish from them. However, machine-learning-based methods for fraud transaction detection have focused mostly on class imbalance rather than on overlap. This paper proposes a novel hybrid method, based on a divide-and-conquer idea, to handle the problem of class imbalance with overlap. First, an anomaly detection model is trained on the minority samples and used to exclude both a few outliers of the minority class and a large number of majority samples from the original dataset. The remaining samples then form an overlapping subset that has a lower imbalance ratio than the original dataset and suffers less learning interference from both the minority and majority classes. Next, this difficult overlapping subset is handled by a non-linear classifier in order to separate the two classes well. To ensure that the overlapping subset has good properties, we propose a novel assessment criterion, Dynamic Weighted Entropy (DWE), to evaluate its quality. DWE is a specially designed trade-off between the number of excluded minority-class outliers and the class imbalance ratio of the overlapping subset. With the help of DWE, the time spent searching for good hyper-parameters is dramatically reduced. Extensive experiments on the Kaggle fraud detection dataset and a large real-world electronic transaction dataset demonstrate that our method significantly outperforms state-of-the-art methods.
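The divide-and-conquer pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the anomaly detection model, the non-linear classifier, or the DWE formula, so a centroid-distance detector and a k-nearest-neighbour vote stand in for the two models, the data are synthetic, and all function names are hypothetical. The code does not compute DWE; it only reports the two quantities that DWE is said to trade off.

```python
import math
import random

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def split_overlap(X, y, quantile=0.9):
    """Stand-in anomaly detector (NOT the paper's model): fit a ball on the
    minority class (label 1) and split all samples into inside/outside."""
    fraud = [x for x, label in zip(X, y) if label == 1]
    center = centroid(fraud)
    dists = sorted(math.dist(x, center) for x in fraud)
    # Radius that keeps roughly `quantile` of the minority samples.
    radius = dists[min(len(dists) - 1, int(quantile * len(dists)))]
    inside = [i for i, x in enumerate(X) if math.dist(x, center) <= radius]
    outside = [i for i, x in enumerate(X) if math.dist(x, center) > radius]
    return center, radius, inside, outside

def imbalance_ratio(y, idx):
    """Majority-to-minority count ratio over the given sample indices."""
    pos = sum(1 for i in idx if y[i] == 1)
    neg = len(idx) - pos
    return neg / pos if pos else float("inf")

def knn_predict(X, y, train_idx, query, k=5):
    """Stand-in non-linear classifier: majority vote of the k nearest samples."""
    nearest = sorted(train_idx, key=lambda i: math.dist(X[i], query))[:k]
    votes = sum(y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def predict(X, y, center, radius, inside, query):
    """Samples outside the ball are labelled genuine; the rest go to k-NN."""
    if math.dist(query, center) > radius:
        return 0
    return knn_predict(X, y, inside, query)

# Synthetic data (an assumption): a wide genuine cloud and a small fraud
# cluster that sits inside it, so the two classes overlap.
rng = random.Random(0)
X = [[rng.gauss(0.0, 2.0), rng.gauss(0.0, 2.0)] for _ in range(2000)]
y = [0] * 2000
X += [[rng.gauss(1.5, 0.5), rng.gauss(1.5, 0.5)] for _ in range(40)]
y += [1] * 40

center, radius, inside, outside = split_overlap(X, y)
excluded_fraud = sum(y[i] for i in outside)  # minority outliers that were excluded

print("imbalance ratio, original dataset:", imbalance_ratio(y, range(len(X))))
print("imbalance ratio, overlapping subset:", imbalance_ratio(y, inside))
print("excluded minority samples:", excluded_fraud)
print("prediction far from the fraud cluster:",
      predict(X, y, center, radius, inside, [-5.0, -5.0]))
```

Raising `quantile` keeps more minority samples in the overlapping subset but also admits more majority samples, which raises its imbalance ratio; lowering it does the opposite. This is the tension that, according to the abstract, DWE is designed to balance when choosing hyper-parameters.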


Updated: 2021-03-15
