BGNN4VD: Constructing Bidirectional Graph Neural-Network for Vulnerability Detection,Information and Software Technology


Information and Software Technology ( IF 3.8 ) Pub Date : 2021-03-20 , DOI: 10.1016/j.infsof.2021.106576
Sicong Cao , Xiaobing Sun , Lili Bo , Ying Wei , Bin Li

Context:

Previous studies have shown that deep learning-based approaches can significantly improve the performance of vulnerability detection. They represent code in various forms and mine vulnerability features with deep learning models. However, differences in code representation and in the deep learning models used leave these approaches with limitations. In practice, their false-positive rate (FPR) and false-negative rate (FNR) are still high.

Objective:

To address the limitations of existing vulnerability detection approaches, we propose BGNN4VD (Bidirectional Graph Neural Network for Vulnerability Detection), a vulnerability detection approach built on a Bidirectional Graph Neural-Network (BGNN).

Method:

In Phase 1, we extract the syntactic and semantic information of source code through the abstract syntax tree (AST), control flow graph (CFG), and data flow graph (DFG). In Phase 2, we use the vectorized source code as input to the Bidirectional Graph Neural-Network (BGNN). In Phase 3, we learn the features that distinguish vulnerable code from non-vulnerable code by introducing backward edges on top of a traditional Graph Neural-Network (GNN). Finally, in Phase 4, a Convolutional Neural-Network (CNN) further extracts features, and a classifier detects vulnerabilities.
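The idea behind Phase 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `add_backward_edges` and `propagate`, the `_rev` edge-type suffix, the scalar node features, and the toy graph are all assumptions made for illustration. The sketch only shows how adding a reversed copy of each edge lets information flow against the original edge direction during message passing.

```python
from collections import defaultdict


def add_backward_edges(edges):
    """Return the edge list extended with a reversed copy of every edge.

    Each edge is (source, target, edge_type). Reversed edges get a distinct
    type suffix (hypothetical naming) so a model could treat the two
    directions differently.
    """
    backward = [(dst, src, etype + "_rev") for src, dst, etype in edges]
    return list(edges) + backward


def propagate(features, edges):
    """One round of sum-aggregation message passing.

    Each node's new value is its own value plus the sum of the values of
    all nodes that have an edge pointing to it.
    """
    incoming = defaultdict(list)
    for src, dst, _etype in edges:
        incoming[dst].append(src)
    return {
        node: value + sum(features[src] for src in incoming[node])
        for node, value in features.items()
    }


# Toy three-node graph with two control-flow edges and one data-flow edge.
edges = [(0, 1, "cfg"), (1, 2, "cfg"), (0, 2, "dfg")]
features = {0: 1.0, 1: 10.0, 2: 100.0}

forward_only = propagate(features, edges)
bidirectional = propagate(features, add_backward_edges(edges))

print(forward_only)   # node 0 receives nothing from its successors
print(bidirectional)  # node 0 now also aggregates from nodes 1 and 2
```

With forward edges only, node 0 never sees the statements that follow it. After the backward edges are added, every node aggregates from both its predecessors and its successors.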

Results:

We evaluate BGNN4VD on four popular C/C++ projects from NVD and GitHub, and compare it with four state-of-the-art vulnerability detection approaches (Flawfinder, RATS, SySeVR, and VUDDY). Experimental results show that, compared with these baselines, BGNN4VD achieves 4.9%, 11.0%, and 8.4% improvement in F1-measure, accuracy, and precision, respectively.
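For readers unfamiliar with the metrics named in this abstract, the sketch below computes them from a confusion matrix using their standard definitions. The function name `detection_metrics` and the counts passed to it are purely illustrative assumptions. They are not data from the paper, and the code assumes every denominator is non-zero.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Denominators are assumed non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),  # false-positive rate
        "fnr": fn / (fn + tp),  # false-negative rate
    }


# Illustrative counts only, not results from the paper.
metrics = detection_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that FNR equals one minus recall, so lowering the false-negative rate and raising recall are the same goal.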

Conclusion:

The proposed BGNN4VD achieves higher precision and accuracy than the state-of-the-art methods. In addition, when applied to the latest vulnerabilities reported by CVE, BGNN4VD still achieves a precision of 45.1%, which demonstrates its feasibility in practical application.
