Empirical Analysis of Rank Aggregation-Based Multi-Filter Feature Selection Methods in Software Defect Prediction
Electronics (IF 2.9), Pub Date: 2021-01-15, DOI: 10.3390/electronics10020179
Abdullateef O. Balogun, Shuib Basri, Saipunidzam Mahamad, Said Jadid Abdulkadir, Luiz Fernando Capretz, Abdullahi A. Imam, Malek A. Almomani, Victor E. Adeyemo, Ganesh Kumar

Selecting the filter method that yields the best-performing feature subset remains an open problem, known as the filter rank selection problem. A viable solution is to apply a mixture of filter methods independently and evaluate the results. This study proposes novel rank aggregation-based multi-filter feature selection (FS) methods to address high dimensionality and the filter rank selection problem in software defect prediction (SDP). The proposed methods use rank aggregation mechanisms to combine the rank lists generated by individual filter methods into a single aggregated rank list. By drawing on multiple filter methods with diverse computational characteristics, they aim to resolve the filter selection problem and produce a disjoint and complete feature rank list superior to those of individual filter rank methods. The effectiveness of the proposed methods was evaluated with Decision Tree (DT) and Naïve Bayes (NB) models on defect datasets from the NASA repository. In the experiments, the proposed methods had a greater positive impact on the prediction performance of the NB and DT models than the other FS methods tested. This makes the combination of filter rank methods a viable solution to the filter rank selection problem and a means of enhancing prediction models in SDP.
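
The aggregation step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the aggregation functions or the filter methods used, so everything here is an assumption for illustration: the function `aggregate_ranks`, the choice of mean rank position as the combiner, and the three filter rankings and feature names are all hypothetical.

```python
from statistics import mean


def aggregate_ranks(rank_lists, combine=mean):
    """Merge several feature rankings into one aggregated ranking.

    rank_lists: list of lists, each ordering the same feature names
                from best (index 0) to worst.
    combine:    function applied to a feature's per-filter rank positions
                (mean is assumed here; min or max are drop-in alternatives).
    """
    features = rank_lists[0]
    for ranking in rank_lists:
        if len(ranking) != len(features) or set(ranking) != set(features):
            raise ValueError("every filter must rank the same set of features")

    # Rank position (1 = best) of each feature in each filter's list.
    positions = {f: [r.index(f) + 1 for r in rank_lists] for f in features}
    scores = {f: combine(p) for f, p in positions.items()}

    # Lower aggregated score is better; ties are broken by feature name
    # so the output is deterministic.
    return sorted(features, key=lambda f: (scores[f], f))


# Hypothetical rankings from three filter methods over made-up metric names.
filter_a = ["loc", "cyclomatic", "halstead_volume", "comment_ratio"]
filter_b = ["cyclomatic", "loc", "comment_ratio", "halstead_volume"]
filter_c = ["loc", "halstead_volume", "cyclomatic", "comment_ratio"]

aggregated = aggregate_ranks([filter_a, filter_b, filter_c])
top_k = aggregated[:2]  # keep the k best features for the classifier
print(aggregated)
```

Passing `combine=min` or `combine=max` swaps in a different aggregation rule without changing the rest of the sketch; which rule the study actually evaluates is not stated in the abstract.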

Updated: 2021-01-15