PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU Learning,arXiv - CS - Software Engineering


PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU Learning
arXiv - CS - Software Engineering Pub Date : 2020-03-08 , DOI: arxiv-2003.03741
Triet H. M. Le, David Hin, Roland Croft, M. Ali Babar

Security is an increasing concern in software development. Developer Question and Answer (Q&A) websites provide a large amount of security discussion. Existing studies have used human-defined rules to mine security discussions, but these approaches still miss many posts, which may lead to an incomplete analysis of the security practices reported on Q&A websites. Traditional supervised Machine Learning methods can automate the mining process; however, the required negative (non-security) class is too expensive to obtain. We propose a novel learning framework, PUMiner, to automatically mine security posts from Q&A websites. PUMiner builds a context-aware embedding model to extract features of the posts, and then develops a two-stage PU model to identify security content using the labelled Positive and Unlabelled posts. We evaluate PUMiner on more than 17.2 million posts on Stack Overflow and 52,611 posts on Security StackExchange. We show that PUMiner is effective, with a validation performance of at least 0.85 across all model configurations. Moreover, the Matthews Correlation Coefficient (MCC) of PUMiner is 0.906, 0.534, and 0.084 points higher than that of one-class SVM, positive-similarity filtering, and one-stage PU models, respectively, on unseen testing posts. PUMiner also performs well, with an MCC of 0.745, in scenarios where string matching totally fails. Even when the ratio of labelled positive posts to unlabelled ones is only 1:100, PUMiner still achieves a strong MCC of 0.65, which is 160% better than fully-supervised learning. Using PUMiner, we provide the largest and most up-to-date collection of security content on Q&A websites for practitioners and researchers.
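The abstract does not spell out what the two stages of the PU model are. As a rough illustration only, the sketch below follows the common two-step PU learning strategy: first pick "reliable negatives" from the unlabelled set, then train an ordinary binary classifier on positives versus those negatives. It is not the PUMiner implementation. The function names, the centroid-distance heuristic, the nearest-centroid classifier, and the `negative_fraction` parameter are all assumptions made for this example, and the toy 2-D vectors stand in for the post embeddings.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_stage_pu(positives, unlabelled, negative_fraction=0.5):
    """Return a predict(vector) -> bool function (True means 'positive').

    Assumption: both input lists are non-empty. This is a generic
    two-step PU heuristic, not the method from the paper.
    """
    # Stage 1: treat the unlabelled points farthest from the positive
    # centroid as reliable negatives.
    pos_centroid = centroid(positives)
    ranked = sorted(
        unlabelled,
        key=lambda v: distance(v, pos_centroid),
        reverse=True,
    )
    n_negatives = max(1, int(len(ranked) * negative_fraction))
    reliable_negatives = ranked[:n_negatives]

    # Stage 2: fit a binary classifier on positives vs. reliable
    # negatives. A nearest-centroid rule keeps the sketch dependency-free.
    neg_centroid = centroid(reliable_negatives)

    def predict(vector):
        return distance(vector, pos_centroid) <= distance(vector, neg_centroid)

    return predict


# Toy data: labelled security-like posts cluster near (1, 1); the
# unlabelled pool hides one more such post among unrelated ones.
labelled_positive = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
unlabelled_pool = [[1.1, 1.0], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]

classify = two_stage_pu(labelled_positive, unlabelled_pool)
hidden_positive_found = classify([1.1, 1.0])
unrelated_post_flagged = classify([5.0, 5.0])
```

In practice the second stage would typically be a stronger classifier trained on learned embeddings. The point here is only the division of labour between the two stages when no labelled negative class is available.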


Updated: 2020-03-26
