Correlation-based Oversampling aided Cost Sensitive Ensemble learning technique for Treatment of Class Imbalance,Journal of Experimental & Theoretical Artificial Intelligence



Journal of Experimental & Theoretical Artificial Intelligence (IF 2.2), Pub Date: 2021-01-13
Debashree Devi, Saroj K. Biswas, Biswajit Purkayastha

ABSTRACT

Class imbalance and its effect on conventional learning models is a well-investigated topic, as it strongly influences the performance of real-life classification tasks. Among the available solutions, the Synthetic Minority Oversampling Technique (SMOTE) is effective at balancing data by generating synthetic minority instances. However, SMOTE tends to generate redundant data because it applies a uniform oversampling rate; SMOTE with a customised oversampling rate has therefore been investigated recently. In parallel, ensemble learning approaches are quite effective at improving the predictive ability of a set of weak classifiers through adaptive-weighted training. However, they do not account for the imbalanced nature of the data during training. This paper proposes Correlation-based Oversampling aided Cost Sensitive Ensemble learning (CorrOV-CSEn), which integrates correlation-based oversampling with the AdaBoost ensemble learning model. The correlation-based oversampling defines a customised oversampling rate and a suitable oversampling zone, while a misclassification ratio-based cost function is introduced into the AdaBoost model to govern adaptive learning of imbalanced cases. CorrOV-CSEn is evaluated against 13 state-of-the-art methods on 8 simulation datasets. The experimental results show CorrOV-CSEn to be more effective than the state-of-the-art methods in resolving these issues.
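The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' CorrOV-CSEn algorithm: the abstract does not give the correlation-based rate, the oversampling zone, or the exact cost function, so the code below substitutes plain SMOTE-style interpolation and a generic cost-scaled AdaBoost weight update. The function names, the `k` parameter, and the per-sample `cost` vector are all assumptions made for illustration.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    minority instance and one of its k nearest minority neighbours.
    Plain SMOTE idea; stands in for the paper's correlation-based variant."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # requires at least two minority instances
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is never its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)               # seed instances
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                        # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])


def cost_sensitive_weight_update(weights, y_true, y_pred, cost):
    """One AdaBoost-style reweighting round with labels in {-1, +1}.
    Misclassified samples are up-weighted by exp(alpha * cost[i]);
    correctly classified samples are down-weighted by exp(-alpha).
    The cost vector is a hypothetical stand-in for the paper's
    misclassification ratio-based cost function."""
    weights = np.asarray(weights, dtype=float)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cost = np.asarray(cost, dtype=float)
    miss = y_true != y_pred
    err = np.clip(weights[miss].sum() / weights.sum(), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1.0 - err) / err)  # weak learner's vote weight
    new_w = weights * np.exp(alpha * np.where(miss, cost, -1.0))
    return new_w / new_w.sum(), alpha
```

Setting `cost` higher for minority-class samples makes a misclassified minority instance gain more weight than a misclassified majority instance in the next boosting round, which is the general intent of cost-sensitive boosting on imbalanced data.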


Updated: 2021-01-13
