Statistical Analysis of the Performance of Rank Fusion Methods Applied to a Homogeneous Ensemble Feature Ranking
Scientific Programming, Pub Date: 2020-09-10, DOI: 10.1155/2020/8860044
Majid Soheili 1 , Amir-Masoud Eftekhari Moghadam 1 , Mehdi Dehghan 2

Feature ranking, a subcategory of feature selection, is an essential preprocessing technique that orders all features of a dataset so that the highly ranked features are the ones that carry the most information. Ensemble learning has two advantages. First, it rests on the assumption that combining the outputs of different models can lead to a better outcome than the output of any individual model. Second, scalability is an intrinsic characteristic, which is crucial when coping with large-scale datasets. In this paper, a homogeneous ensemble feature ranking algorithm is considered, and nine rank fusion methods used in this algorithm are compared. The experiments are performed on six real, medium-sized datasets, and the methods are assessed by the area under the feature-forward-addition curve. Finally, a repeated-measures analysis of variance reveals that the differences in performance among the rank fusion methods applied in a homogeneous ensemble feature ranking are small; however, they are statistically significant, and the B-Min method performs slightly better.
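As a rough illustration of what a rank fusion step does, the sketch below combines several per-partition rank vectors into one feature ordering. It is only a minimal example under stated assumptions: the abstract does not define the nine fusion methods or B-Min, so the generic min, mean, and median rules used here are stand-ins, and the function name `fuse_ranks` and the toy rank vectors are hypothetical.

```python
from statistics import mean, median


def fuse_ranks(rank_lists, combine=min):
    """Fuse several rank vectors into one ordering of feature indices.

    rank_lists[k][j] is the rank (1 = best) that ranker k gave to feature j.
    `combine` reduces the ranks of one feature to a single score.
    NOTE: hypothetical helper; min/mean/median are generic fusion rules,
    not necessarily the methods analyzed in the paper.
    """
    n_features = len(rank_lists[0])
    fused = [combine([ranks[j] for ranks in rank_lists])
             for j in range(n_features)]
    # Lower fused score means a better feature; ties go to the lower index.
    return sorted(range(n_features), key=lambda j: (fused[j], j))


# Toy input: three rankers (e.g. one per data partition), four features.
rank_lists = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [3, 1, 2, 4],
]

order_min = fuse_ranks(rank_lists, min)
order_mean = fuse_ranks(rank_lists, mean)
order_median = fuse_ranks(rank_lists, median)
```

In a homogeneous ensemble, each rank vector would presumably come from the same ranking algorithm applied to a different partition of the data; swapping the `combine` argument is then enough to compare fusion rules on the same inputs.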

Updated: 2020-09-10