Decision tree classification algorithm for non-equilibrium data set based on random forests
Journal of Intelligent & Fuzzy Systems ( IF 1.7 ) Pub Date : 2020-06-29 , DOI: 10.3233/jifs-179937
Peng Wang 1 , Ningchao Zhang 1

To overcome the poor accuracy and high complexity of current classification algorithms for non-equilibrium (imbalanced) data sets, this paper proposes a decision tree classification algorithm for non-equilibrium data sets based on random forests. Wavelet packet decomposition is used to denoise the non-equilibrium data, and the SNM algorithm is combined with RFID to remove redundant data from the data sets. The preprocessed non-equilibrium data sets are then classified by the random forest method: a Bootstrap resampling method with certain constraints draws the majority and minority samples of each sample subset, CART is trained on each subset to construct a decision tree, and the final classification result is obtained by voting over the CART decision trees. Experimental results show that the proposed algorithm achieves high classification accuracy with low complexity and is a feasible classification algorithm for non-equilibrium data sets.
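
The resampling, CART training, and voting steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state the constraints on the Bootstrap step, so the sketch assumes one common choice: each tree draws, with replacement, as many samples from every class as the minority class contains. The tree is a simplified CART-style learner (binary splits chosen by Gini impurity). The wavelet packet denoising, the SNM/RFID redundancy removal, and the per-split random feature selection of a full random forest are omitted. All function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    n = len(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:  # require a strict improvement
                best, best_score = (f, t), score
    return best

def build_tree(X, y, depth=0, max_depth=5):
    """Grow a binary tree; leaves are labels (assumed not to be tuples)."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    f, t = split
    li = [i for i in range(len(y)) if X[i][f] <= t]
    ri = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth),
            build_tree([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def balanced_bootstrap(X, y, rng):
    """Assumed constraint: draw minority-class-size samples from every class."""
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    k = min(len(idx) for idx in by_class.values())
    picked = [i for idx in by_class.values() for i in rng.choices(idx, k=k)]
    return [X[i] for i in picked], [y[i] for i in picked]

def fit_forest(X, y, n_trees=15, seed=0):
    """Train one tree per balanced bootstrap sample."""
    rng = random.Random(seed)
    return [build_tree(*balanced_bootstrap(X, y, rng)) for _ in range(n_trees)]

def predict_forest(trees, row):
    """Final label is the majority vote over all trees."""
    votes = Counter(predict_tree(t, row) for t in trees)
    return votes.most_common(1)[0][0]
```

With a feature matrix `X` (a list of numeric rows) and labels `y`, calling `fit_forest(X, y)` and then `predict_forest(trees, row)` classifies a new row. Balancing each bootstrap sample keeps the majority class from dominating every tree, which is one plausible reading of the constrained resampling the abstract mentions.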

Updated: 2020-06-30