Principal component based support vector machine (PC-SVM): a hybrid technique for software defect detection
Cluster Computing ( IF 3.6 ) Pub Date : 2021-04-16 , DOI: 10.1007/s10586-021-03282-8
Mohd Mustaqeem 1 , Mohd Saqib 2

Software defects are a major problem, and predicting them is a difficult task. Researchers have developed many software defect prediction techniques to address this issue, but there is still a need for a method that predicts defects more accurately while reducing time and space complexity. Previous research conducted on the data without feature reduction is exposed to the curse of dimensionality. We propose a hybrid machine learning approach that combines principal component analysis (PCA) and support vector machines (SVM) to overcome this problem. We used PROMISE data from NASA (CM1: 344 observations, KC1: 2109 observations) to conduct our research. We split each dataset into a training set (CM1: 240 observations, KC1: 1476 observations) and a testing set (CM1: 104 observations, KC1: 633 observations). Using PCA, we find the principal components for feature optimization, which reduces the time complexity. We then apply SVM for classification because of its inherent advantages over traditional methods, and we use the GridSearchCV method for hyperparameter tuning. The proposed hybrid model achieves better accuracy (CM1: 95.2%, KC1: 86.6%) than other methods and also scores higher on the other evaluation criteria. As a limitation, SVM provides no probabilistic explanation for its classifications, which can make them rigid. In the future, other methods may be introduced that overcome this limitation and keep a soft, probability-based margin for classification around the optimal hyperplane.
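The workflow described in the abstract (reduce the features with PCA, classify with SVM, tune with GridSearchCV) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the data is synthetic rather than CM1 or KC1, and the feature count, the standardization step, the parameter grid, and the 5-fold cross-validation are all assumptions chosen for the example. Only the 240/104 split sizes are taken from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a defect dataset (NOT the real CM1/KC1 data).
# 344 rows mirror the CM1 size; the 21 features and the class imbalance
# are assumptions made only for this sketch.
X, y = make_classification(
    n_samples=344,
    n_features=21,
    n_informative=8,
    weights=[0.85, 0.15],
    random_state=0,
)

# Hold out 104 rows for testing, leaving 240 for training (CM1 split sizes).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=104, stratify=y, random_state=0
)

# Standardize, project onto principal components, then classify with an SVM.
# The scaling step is an assumption; PCA is sensitive to feature scale.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("svm", SVC()),
])

# Hypothetical search grid: number of components and SVM hyperparameters.
param_grid = {
    "pca__n_components": [5, 10, 15],
    "svm__C": [0.1, 1, 10],
    "svm__kernel": ["linear", "rbf"],
}

# Cross-validated grid search on the training set only.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Accuracy of the best pipeline on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(f"test accuracy: {test_accuracy:.3f}")
```

Wrapping PCA and the SVM in a single `Pipeline` means the principal components are refit inside each cross-validation fold, so the held-out fold never influences the projection. The accuracy printed here reflects the synthetic data and says nothing about the figures reported in the abstract.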




Updated: 2021-04-16