Entropy and Confidence-Based Undersampling Boosting Random Forests for Imbalanced Problems.
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2) Pub Date: 2020-01-24, DOI: 10.1109/tnnls.2020.2964585
Zhe Wang , Chenjie Cao , Yujin Zhu

In this article, we propose a novel entropy- and confidence-based undersampling boosting (ECUBoost) framework to solve imbalanced problems. A boosting-based ensemble is combined with a new undersampling method to improve generalization performance. To avoid losing informative samples during the data preprocessing of the boosting-based ensemble, ECUBoost uses both confidence and entropy as benchmarks to preserve the validity and structural distribution of the majority samples during undersampling. Moreover, unlike other iterative dynamic resampling methods, confidence-based ECUBoost can be applied to algorithms without iterations, such as decision trees. Random forests are used as the base classifiers in ECUBoost. Finally, experimental results on both artificial data sets and KEEL data sets demonstrate the effectiveness of the proposed method.
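The abstract does not spell out the exact update rule, but the core selection idea can be sketched in a few lines. Below is a minimal illustration assuming scikit-learn: a random forest scores each majority-class sample by its classification confidence (validity) and by the entropy of its predicted class distribution (structural information), and the highest-scoring samples are kept. The function name ecuboost_style_undersample, the equal weighting of the two benchmarks, and the n_keep parameter are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def ecuboost_style_undersample(X, y, majority_label, n_keep):
    """Hypothetical sketch: rank majority-class samples by confidence and
    prediction entropy from a random forest, then keep the highest-ranked
    ones so informative samples survive the undersampling."""
    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    rf.fit(X, y)
    proba = rf.predict_proba(X)                    # class probabilities per sample
    maj = np.flatnonzero(y == majority_label)      # indices of majority samples
    maj_col = list(rf.classes_).index(majority_label)
    confidence = proba[maj, maj_col]               # validity benchmark
    entropy = -np.sum(proba[maj] * np.log(proba[maj] + 1e-12), axis=1)  # structure benchmark
    # Equal weighting of the two benchmarks is an assumption of this sketch.
    score = confidence + entropy
    keep = maj[np.argsort(score)[-n_keep:]]        # best-scoring majority samples
    sel = np.concatenate([keep, np.flatnonzero(y != majority_label)])
    return X[sel], y[sel]
```

Per the abstract, the full framework embeds this kind of selection in a boosting loop with random forests as base classifiers; the sketch shows only a single selection step.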

Updated: 2020-01-24