NEDetector: Automatically extracting cybersecurity neologisms from hacker forums, Journal of Information Security and Applications



NEDetector: Automatically extracting cybersecurity neologisms from hacker forums
Journal of Information Security and Applications (IF 3.8), Pub Date: 2021-02-10, DOI: 10.1016/j.jisa.2021.102784
Ying Li, Jiaxing Cheng, Cheng Huang, Zhouguo Chen, Weina Niu

Underground hacker forums serve as online social platforms where hackers communicate and spread hacking techniques and tools. Much of the latest information in these forums directly or indirectly affects cyberspace security and thereby threatens the assets of enterprises and individuals. Social media such as hacker forums and Twitter therefore have a great impact on the cybersecurity field. In recent years, analyzing hacker forum data to explore hacking activities and support cybersecurity situational awareness has aroused widespread interest among researchers. However, automatically identifying cybersecurity words and extracting neologisms from open source social platforms has been less successful and still requires further research. To provide early warning of cyber attack incidents, we propose NEDetector, a novel method that automatically identifies cybersecurity words and extracts neologisms from unstructured content, focusing mainly on attack groups and hacking tools. First, NEDetector analyzes cybersecurity words and proposes four groups of features to build a cybersecurity word identification model based on a bidirectional LSTM. Second, NEDetector introduces four sets of features to identify cybersecurity neologisms with a Random Forest algorithm. Experimental results show that the complete NEDetector system achieves an identification precision of 89.11%. Furthermore, when making predictions on Twitter data, the proposed neologism extraction system often flags a term before Google Trends has enough data for it, which demonstrates the validity and timeliness of the presented system.
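The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not list the features or model settings, so the bidirectional LSTM and the Random Forest are replaced here by toy stand-in functions, and every name below (extract_neologisms, precision, toy_tagger, toy_is_neologism, the vocabularies, and the made-up term "fooloader") is a hypothetical placeholder. The sketch only shows how a word-identification stage feeds a neologism-filtering stage, and how precision is computed over the extracted terms.

```python
from typing import Callable, Iterable, List, Sequence, Set

# Stage 1 stand-in: takes the tokens of one post, returns one flag per token.
# In the paper this role is played by a bidirectional LSTM model.
TokenTagger = Callable[[Sequence[str]], List[bool]]
# Stage 2 stand-in: decides whether a cybersecurity word is a neologism.
# In the paper this role is played by a Random Forest classifier.
NeologismClassifier = Callable[[str], bool]


def extract_neologisms(
    posts: Iterable[str],
    tag_security_words: TokenTagger,
    is_neologism: NeologismClassifier,
) -> List[str]:
    """Run stage 1 over every post, then filter the candidates with stage 2."""
    candidates: List[str] = []
    seen: Set[str] = set()
    for post in posts:
        tokens = post.split()  # naive whitespace tokenization (assumption)
        flags = tag_security_words(tokens)
        for token, flagged in zip(tokens, flags):
            word = token.lower()
            if flagged and word not in seen:
                seen.add(word)
                candidates.append(word)
    return [word for word in candidates if is_neologism(word)]


def precision(predicted: Iterable[str], gold: Iterable[str]) -> float:
    """Precision = correct predictions / all predictions."""
    predicted_set = set(predicted)
    if not predicted_set:
        return 0.0
    return len(predicted_set & set(gold)) / len(predicted_set)


# Toy stand-ins, for illustration only. "fooloader" is an invented term.
TOY_SECURITY_VOCAB = {"fooloader", "botnet", "exploit"}
TOY_KNOWN_TERMS = {"botnet", "exploit"}


def toy_tagger(tokens: Sequence[str]) -> List[bool]:
    return [token.lower() in TOY_SECURITY_VOCAB for token in tokens]


def toy_is_neologism(word: str) -> bool:
    return word not in TOY_KNOWN_TERMS


if __name__ == "__main__":
    posts = ["new fooloader botnet spotted", "selling exploit kit with fooloader"]
    found = extract_neologisms(posts, toy_tagger, toy_is_neologism)
    print(found, precision(found, {"fooloader"}))
```

Passing the two stages in as plain callables keeps the sketch independent of any particular machine learning library; trained models could be wrapped in functions with the same signatures.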


Updated: 2021-02-10
