Software defect prediction based on correlation weighted class association rule mining
Knowledge-Based Systems ( IF 7.2 ) Pub Date : 2020-03-23 , DOI: 10.1016/j.knosys.2020.105742
Yuanxun Shao , Bin Liu , Shihai Wang , Guoqi Li

Software defect prediction based on supervised learning plays a crucial role in guiding the allocation of software testing resources. Associative classification is a notable option because it predicts defects with both high accuracy and comprehensibility. However, because defect data are inherently imbalanced, rule mining tends to generate a large number of non-defective class association rules while overlooking the defective ones. Furthermore, classical associative classification algorithms measure the interestingness of rules mainly by occurrence frequency, such as support and confidence, without considering the importance of features, which yields frequent itemsets built from insignificant features. This motivated the development of weighted associative classification. However, feature weighting based on domain knowledge is subjective and unsuitable for high-dimensional datasets. Hence, we present a novel software defect prediction model based on correlation weighted class association rule mining (CWCAR). It uses a framework based on multiple weighted supports, rather than the traditional support-confidence approach, to handle class imbalance, and it uses a correlation-based heuristic to assign feature weights. We also optimize the ranking, pruning, and prediction stages based on weighted support. Results show that CWCAR is significantly superior to state-of-the-art classifiers in terms of Balance, MCC, and Gmean.
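The abstract names Balance, MCC, and Gmean without defining them, and it describes weighted support only in words. The Python sketch below uses the definitions commonly found in the defect prediction literature for the three metrics; these are assumed here, not taken from the paper. The `weighted_support` function is purely hypothetical: it multiplies plain support by the mean feature weight of the itemset, which is one common form in weighted association rule mining and not necessarily the formula CWCAR uses. All function names are illustrative.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating label 1 as the defective class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def balance(tp, fp, tn, fn):
    """Assumed definition: 1 minus the normalized distance from (pf=0, pd=1)."""
    pd = tp / (tp + fn)  # probability of detection (recall)
    pf = fp / (fp + tn)  # probability of false alarm
    return 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)

def g_mean(tp, fp, tn, fn):
    """Assumed definition: geometric mean of recall and specificity."""
    pd = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(pd * specificity)

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 if the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def weighted_support(itemset, transactions, weights):
    """Hypothetical weighted support: mean item weight times plain support."""
    support = sum(1 for t in transactions if itemset <= t) / len(transactions)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * support

if __name__ == "__main__":
    counts = confusion_counts([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
    print(balance(*counts), g_mean(*counts), mcc(*counts))
```

`balance` and `g_mean` assume that both classes appear in `y_true`; otherwise the recall or false-alarm denominators are zero. All three metrics combine performance on the defective and non-defective classes, which is why they are preferred over plain accuracy on imbalanced defect data.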




Updated: 2020-03-24