Deep Learning based Vulnerability Detection: Are We There Yet?,arXiv - CS - Software Engineering


arXiv - CS - Software Engineering. Pub Date: 2020-09-03, DOI: arxiv-2009.07235
Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, Baishakhi Ray

Automated detection of software vulnerabilities is a fundamental problem in software security. Existing program analysis techniques suffer from either high false positive or high false negative rates. Recent progress in Deep Learning (DL) has resulted in a surge of interest in applying DL to automated vulnerability detection. Several recent studies have demonstrated promising results, achieving an accuracy of up to 95% at detecting vulnerabilities. In this paper, we ask, "how well do the state-of-the-art DL-based techniques perform in a real-world vulnerability prediction scenario?" To our surprise, we find that their performance drops by more than 50%. A systematic investigation of what causes such a precipitous performance drop reveals that existing DL-based vulnerability prediction approaches suffer from challenges with the training data (e.g., data duplication, unrealistic distribution of vulnerable classes, etc.) and with the model choices (e.g., simple token-based models). As a result, these approaches often do not learn features related to the actual cause of the vulnerabilities. Instead, they learn unrelated artifacts from the dataset (e.g., specific variable/function names, etc.). Leveraging these empirical findings, we demonstrate how a more principled approach to data collection and model design, based on realistic settings of vulnerability prediction, can lead to better solutions. The resulting tools perform significantly better than the studied baseline: up to 33.57% boost in precision and 128.38% boost in recall compared to the best performing model in the literature. Overall, this paper elucidates the potential issues of existing DL-based vulnerability prediction systems and draws a roadmap for future DL-based vulnerability prediction research. In that spirit, we make available all the artifacts supporting our results: https://git.io/Jf6IA.
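
The data duplication issue named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes samples are `(source_code, label)` pairs of C-like code, and the helper names `normalize`, `fingerprint`, and `drop_leaked` are hypothetical. The idea is to hash a crudely normalized form of each function so that test samples already present in the training set can be filtered out before evaluation.

```python
import hashlib
import re


def normalize(code: str) -> str:
    """Crude normalization (an assumption of this sketch): strip C-style
    comments and collapse whitespace so trivially different copies match.
    Note: the regexes do not understand string literals."""
    code = re.sub(r"/\*.*?\*/", " ", code, flags=re.DOTALL)  # block comments
    code = re.sub(r"//[^\n]*", " ", code)                     # line comments
    return " ".join(code.split())


def fingerprint(code: str) -> str:
    """SHA-256 digest of the normalized source text."""
    return hashlib.sha256(normalize(code).encode("utf-8")).hexdigest()


def drop_leaked(train, test):
    """Keep only test samples whose normalized form is absent from train."""
    seen = {fingerprint(src) for src, _ in train}
    return [(src, label) for src, label in test if fingerprint(src) not in seen]


# Toy data, invented for illustration only.
train = [("int f(int a) { return a; } // id", 0)]
test = [
    ("int f(int a) {\n  return a;\n}", 0),   # same function, reformatted
    ("void g(char *s) { gets(s); }", 1),     # not present in train
]
kept = drop_leaked(train, test)
```

Here `kept` retains only the `gets` example, because the first test sample differs from the training sample only in whitespace and a comment. Exact-match hashing catches only near-verbatim copies. Detecting renamed or lightly edited clones would need a stronger similarity measure.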


Updated: 2020-09-16
