The impact factors on the performance of machine learning-based vulnerability detection: A comparative study, Journal of Systems and Software



The impact factors on the performance of machine learning-based vulnerability detection: A comparative study
Journal of Systems and Software (IF 3.7), Pub Date: 2020-10-01, DOI: 10.1016/j.jss.2020.110659
Wei Zheng, Jialiang Gao, Xiaoxue Wu, Fengyu Liu, Yuxing Xun, Guoliang Liu, Xiang Chen

Abstract: Machine learning-based vulnerability detection is an active research topic in software security, and a range of traditional machine learning-based and deep learning-based detection methods have been proposed. To the best of our knowledge, we are the first to identify four impact factors and conduct a comparative study of their influence on performance. In particular, dataset quality, classification models, and vectorization methods can directly affect detection performance, whereas function/variable name replacement changes the features used for vulnerability detection and so affects performance indirectly. We collect three vulnerability code datasets from two sources (i.e., NVD and SARD); these datasets correspond to different types of vulnerabilities. Moreover, we extract and analyze the features of the vulnerability code datasets to explain some experimental results. Our findings can be summarized as follows: (1) Deep learning models achieve better performance than traditional machine learning models; of all the models, BLSTM achieves the best performance. (2) CountVectorizer can significantly improve the performance of traditional machine learning models. (3) Features generated by the random forest algorithm include system-related functions, syntax keywords, and user-defined names; different vulnerability types and code sources generate different features. (4) Replacing user-defined variable and function names in a dataset decreases vulnerability detection performance. (5) As the proportion of code from SARD increases, vulnerability detection performance increases.
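To make two of the factors named in the abstract concrete, the following is a minimal sketch of user-defined name replacement followed by token-count vectorization. It is not the authors' pipeline: the keyword/library whitelist `KEEP`, the `FUN`/`VAR` placeholder scheme, the regex tokenizer, and the example C snippet are all assumptions made for illustration, and the `count_vector` helper is a plain-Python stand-in for a count-based vectorizer such as CountVectorizer.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative whitelist of C keywords and library calls
# that are kept verbatim; a real study would use a far larger list.
KEEP = {"int", "char", "void", "if", "else", "for", "while", "return",
        "sizeof", "strcpy", "memcpy", "malloc", "free", "printf"}

TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")   # identifiers, numbers, symbols
IDENT_RE = re.compile(r"[A-Za-z_]\w*")

def tokenize(code):
    """Split source text into identifier, number, and single-symbol tokens."""
    return TOKEN_RE.findall(code)

def replace_user_names(tokens):
    """Map each user-defined identifier to a placeholder (FUN1, VAR1, ...).

    Hypothetical heuristic: an identifier first seen directly before '(' is
    treated as a function name, anything else as a variable name.
    """
    mapping = {}
    out = []
    for i, tok in enumerate(tokens):
        if IDENT_RE.fullmatch(tok) and tok not in KEEP:
            if tok not in mapping:
                is_call = i + 1 < len(tokens) and tokens[i + 1] == "("
                prefix = "FUN" if is_call else "VAR"
                n = sum(1 for v in mapping.values() if v.startswith(prefix)) + 1
                mapping[tok] = f"{prefix}{n}"
            out.append(mapping[tok])
        else:
            out.append(tok)
    return out

def count_vector(tokens, vocabulary):
    """Return token counts in vocabulary order (a bag-of-tokens vector)."""
    counts = Counter(tokens)
    return [counts[t] for t in vocabulary]

# Illustrative snippet, not taken from NVD or SARD.
snippet = "void copy_input(char *user_buf) { char dest[8]; strcpy(dest, user_buf); }"
raw = tokenize(snippet)
norm = replace_user_names(raw)

vocab = sorted(set(raw) | set(norm))
raw_vec = count_vector(raw, vocab)
norm_vec = count_vector(norm, vocab)
```

After replacement, names such as `copy_input` and `user_buf` no longer appear in the vector, while `strcpy` and the syntax keywords remain. This is consistent with finding (3), which lists user-defined names among the features, and gives one plausible reading of why finding (4) reports lower performance once those names are replaced.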


Updated: 2020-10-01
