Hellinger Net: A Hybrid Imbalance Learning Model to Improve Software Defect Prediction
IEEE Transactions on Reliability (IF 5.0), Pub Date: 2020-09-10, DOI: 10.1109/tr.2020.3020238
Tanujit Chakraborty, Ashis Kumar Chakraborty

Software defect prediction (SDP) is a convenient way to identify defects in the early phases of the software development life cycle. This early warning system can help remove software defects and yield cost-effective, good-quality software products. A wide range of statistical and machine learning models have been employed to predict defects in software modules, but handling the imbalanced nature of SDP datasets is pivotal for the successful development of a defect prediction model. Imbalanced software datasets have nonuniform class distributions, with far fewer instances belonging to one class than to the other. This article proposes a novel hybrid methodology, namely the Hellinger net model, for imbalanced learning to improve defect prediction for software modules. Hellinger net, a tree-to-network mapped model, is a deep feedforward neural network with a built-in hierarchy, just like decision trees. Hellinger net also utilizes the strength of a skew-insensitive distance measure, namely Hellinger distance, in handling class imbalance problems. On the theoretical side, this article proves the consistency of the proposed model. A thorough experiment was conducted over ten NASA SDP datasets to show the superiority of the proposed method.
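To make the skew-insensitivity point concrete, here is a minimal sketch of the Hellinger distance used as a binary split score, in the form commonly used by Hellinger distance decision trees. It is not taken from the paper: the function name `hellinger_split_score`, the threshold-based split, and the 0/1 label encoding (1 = defective) are assumptions for illustration only.

```python
import math


def hellinger_split_score(feature_values, labels, threshold):
    """Hellinger distance between the class-conditional distributions
    over the two branches of the split `x <= threshold`.

    Hypothetical helper for illustration; labels are assumed to be
    0 (non-defective) or 1 (defective).
    """
    pos_total = sum(1 for y in labels if y == 1)
    neg_total = len(labels) - pos_total
    if pos_total == 0 or neg_total == 0:
        return 0.0  # only one class present, so no split is informative

    squared_sum = 0.0
    for goes_left in (True, False):
        pos_in = sum(
            1 for x, y in zip(feature_values, labels)
            if (x <= threshold) == goes_left and y == 1
        )
        neg_in = sum(
            1 for x, y in zip(feature_values, labels)
            if (x <= threshold) == goes_left and y == 0
        )
        # Each term compares within-class fractions, so the class priors
        # (how rare the defective class is) never enter the score.
        squared_sum += (
            math.sqrt(pos_in / pos_total) - math.sqrt(neg_in / neg_total)
        ) ** 2
    return math.sqrt(squared_sum)
```

Because the score depends only on the fraction of each class that falls in each branch, duplicating every majority-class instance leaves it unchanged. The score ranges from 0 (the split carries no class information) to sqrt(2) (the split separates the classes perfectly).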

Updated: 2020-09-10