A machine learning approach for classification of equivalent mutants, Journal of Software: Evolution and Process


Journal of Software: Evolution and Process (IF 1.7), Pub Date: 2019-12-03, DOI: 10.1002/smr.2238
Muhammad Rashid Naeem, Tao Lin, Hamad Naeem, Hailu Liu


Mutation testing is a fault-based technique that assesses the quality of a test suite by introducing artificial syntactic faults, called mutants, into a source program. However, some mutants, known as equivalent mutants, have the same semantics as the original program and cannot be detected by any test input. The equivalent mutant problem (EMP) is undecidable, so identifying a mutant as equivalent or killable requires manual human effort. Constraint-based testing (CBT) theory suggests using mathematical constraints, derived from mutant features, that can help reveal some equivalent mutants. In this paper, we consider three metrics of CBT theory, i.e., reachability, necessity, and sufficiency, to extract feature constraints from mutant programs. Constraints are extracted using program dependency graphs. Other features, such as degree of significance, semantic distance, and information entropy of mutants, are also extracted to build a binary classification model. Machine learning algorithms such as Random Forest, GBT, and SVM are applied under two application scenarios (split-project and cross-project) on ten Java programs to predict equivalent mutants. The analysis demonstrates that the proposed techniques not only improve the efficiency of equivalent mutant detection but also reduce the effort required to perform it, with a small loss in accuracy.
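To make the distinction between equivalent and killable mutants concrete, the following is a minimal sketch, not taken from the paper. The function names, the relational-operator mutations, and the test inputs are all hypothetical and chosen only for illustration. Note that surviving a finite set of inputs does not prove equivalence in general. Here the first mutant is equivalent by inspection, because the two versions can only take different branches when both arguments are equal, and then they return the same value.

```python
from typing import Callable, Iterable, Tuple

IntPair = Tuple[int, int]
BinaryFn = Callable[[int, int], int]


def original_max(a: int, b: int) -> int:
    """Original program: return the larger of two integers."""
    return a if a > b else b


def mutant_equivalent(a: int, b: int) -> int:
    """Hypothetical mutant: '>' replaced by '>='.

    The branches differ only when a == b, where both return the same
    value, so no input can distinguish it from the original.
    """
    return a if a >= b else b


def mutant_killable(a: int, b: int) -> int:
    """Hypothetical mutant: '>' replaced by '<' (now computes the minimum)."""
    return a if a < b else b


def is_killed(original: BinaryFn, mutant: BinaryFn,
              inputs: Iterable[IntPair]) -> bool:
    """A mutant is killed if at least one input yields a different output."""
    return any(original(a, b) != mutant(a, b) for a, b in inputs)


# Illustrative test inputs (an assumption, not data from the paper).
TEST_INPUTS = [(1, 2), (2, 1), (3, 3), (-5, 0)]

if __name__ == "__main__":
    print("equivalent mutant killed:",
          is_killed(original_max, mutant_equivalent, TEST_INPUTS))
    print("killable mutant killed:",
          is_killed(original_max, mutant_killable, TEST_INPUTS))
```

A surviving mutant such as `mutant_equivalent` is the kind of case a tester would otherwise have to inspect by hand, which is the manual effort the classification model described above aims to reduce.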


Updated: 2019-12-03
