LTRWES: A new framework for security bug report detection


Information and Software Technology (IF 3.8), Pub Date: 2020-04-13, DOI: 10.1016/j.infsof.2020.106314
Yuan Jiang, Pengcheng Lu, Xiaohong Su, Tiantian Wang

Context: Security bug reports (SBRs) usually describe security-related vulnerabilities in software products, which could be exploited by malicious attackers. Hence, it is important to identify SBRs quickly and accurately among the bug reports (BRs) that have been disclosed in bug tracking systems. Although a few methods have already been proposed for detecting SBRs, challenges remain because of noisy samples, class imbalance and data scarcity.

Objective: This motivates us to identify the challenges that state-of-the-art SBR prediction methods face, from the viewpoint of data filtering and representation. We also aim to provide a general framework and new solutions that address these problems.

Method: In this study, we propose a novel approach, LTRWES, that incorporates learning to rank and word embedding into the identification of SBRs. Unlike previous keyword-based approaches, LTRWES is a content-based data filtering and representation framework with several desirable properties not shared by other methods. First, it uses a ranking model to efficiently filter out non-security bug reports (NSBRs) that have high content similarity to SBRs. Second, it applies word embedding to transform the remaining NSBRs, together with the SBRs, into low-dimensional real-valued vectors.
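
The two steps described in the Method paragraph can be illustrated with a small sketch. This is not the authors' implementation: cosine similarity over term counts stands in for the learned ranking model, a tiny hand-written table stands in for trained word embeddings, and the names `filter_nsbrs`, `embed`, `drop_ratio` and the sample reports are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the abstract does not specify one)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_nsbrs(sbrs, nsbrs, drop_ratio=0.5):
    """Step 1 (stand-in): score each NSBR by its highest similarity to any SBR,
    rank by that score, and drop the most SBR-like fraction as likely noise."""
    sbr_vecs = [Counter(tokenize(s)) for s in sbrs]
    scored = []
    for report in nsbrs:
        vec = Counter(tokenize(report))
        score = max((cosine(vec, s) for s in sbr_vecs), default=0.0)
        scored.append((score, report))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    n_drop = int(len(scored) * drop_ratio)
    return [report for _, report in scored[n_drop:]]


def embed(text, table, dim):
    """Step 2 (stand-in): represent a report as the mean of its known word vectors."""
    vecs = [table[t] for t in tokenize(text) if t in table]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Hypothetical sample data, for illustration only.
sbrs = ["buffer overflow allows remote code execution"]
nsbrs = [
    "buffer overflow warning shown in log viewer",
    "button label misaligned on settings page",
    "crash when remote code review page loads",
    "typo in help text",
]
toy_table = {"button": [1.0, 0.0], "typo": [0.0, 1.0], "page": [1.0, 1.0]}

kept = filter_nsbrs(sbrs, nsbrs, drop_ratio=0.5)
vectors = [embed(report, toy_table, dim=2) for report in kept]
```

With these sample reports, the two NSBRs that share vocabulary with the SBR are dropped, and the two remaining reports are turned into 2-dimensional vectors. A real pipeline would replace both stand-ins with trained models before passing the vectors to a classifier.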

Result: Experimental results on benchmark and large real-world datasets show that our proposed method outperforms the state-of-the-art method.

Conclusion: Overall, LTRWES is effective and achieves high performance. It will help security engineers identify SBRs among thousands of NSBRs more accurately than existing algorithms. We expect this to encourage further research and development of content-based methods for security bug report detection.


Updated: 2020-04-13
