Proactively Identifying Emerging Hacker Threats from the Dark Web, ACM Transactions on Privacy and Security


ACM Transactions on Privacy and Security (IF 2.3), Pub Date: 2020-08-26, DOI: 10.1145/3409289
Sagar Samtani, Hongyi Zhu, Hsinchun Chen


Cybersecurity experts estimate the total global cost of malicious hacking activities at $450 billion annually. Cyber Threat Intelligence (CTI) has emerged as a viable approach to combat this societal issue. However, existing processes are criticized as inherently reactive to known threats. To address these concerns, CTI experts have suggested proactively examining emerging threats in the vast, international online hacker community. In this study, we aim to develop proactive CTI capabilities by exploring online hacker forums to identify emerging threats in terms of popularity and tool functionality. To achieve these goals, we create a novel Diachronic Graph Embedding Framework (D-GEF). D-GEF operates on a Graph-of-Words (GoW) representation of hacker forum text to generate word embeddings in an unsupervised manner. Semantic displacement measures adopted from the diachronic linguistics literature identify how terminology evolves. A series of benchmark experiments illustrates D-GEF's ability to generate higher-quality embeddings than state-of-the-art word embedding models (e.g., word2vec) in tasks pertaining to semantic analogy, clustering, and threat classification. D-GEF's practical utility is illustrated with in-depth case studies on web application and denial of service threats targeting PHP and Windows technologies, respectively. We also discuss the implications of the proposed framework for strategic, operational, and tactical CTI scenarios. All datasets and code are publicly released to facilitate scientific reproducibility and extensions of this work.
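
The abstract names two building blocks, a Graph-of-Words representation and a semantic displacement measure, without giving their details. The following is a minimal sketch of what each could look like, not the authors' D-GEF implementation. The sliding-window co-occurrence rule, the function names, the toy sentence, and the choice of cosine distance as the displacement measure are all assumptions made for illustration; the vectors passed to `cosine_displacement` are assumed to be nonzero and already expressed in a shared embedding space across time periods.

```python
from collections import Counter
import math


def graph_of_words(tokens, window=2):
    """Return undirected co-occurrence edge weights for a token sequence.

    Hypothetical helper: each word is linked to the words that follow it
    within `window` positions, and repeated co-occurrences add weight.
    """
    edges = Counter()
    for i, word in enumerate(tokens):
        # Slice covers the next `window - 1` tokens after position i.
        for other in tokens[i + 1 : i + window]:
            if other != word:
                edges[tuple(sorted((word, other)))] += 1
    return edges


def cosine_displacement(vec_then, vec_now):
    """Cosine distance (1 - cosine similarity) between two vectors of one word.

    Assumes both vectors are nonzero and live in the same embedding space.
    """
    dot = sum(a * b for a, b in zip(vec_then, vec_now))
    norm_then = math.sqrt(sum(a * a for a in vec_then))
    norm_now = math.sqrt(sum(b * b for b in vec_now))
    return 1.0 - dot / (norm_then * norm_now)


# Toy forum snippet (made up for illustration).
edges = graph_of_words("php shell upload exploit php shell".split(), window=2)
```

In this sketch, a word whose vector points in the same direction in two periods has a displacement of zero, while a word whose usage has shifted to an orthogonal direction has a displacement of one. Ranking terms by such a score is one plausible way to surface terminology that is changing over time.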


Updated: 2020-08-26
