Data and knowledge-driven named entity recognition for cyber security
Cybersecurity, Pub Date: 2021-05-03, DOI: 10.1186/s42400-021-00072-y
Chen Gao, Xuan Zhang, Hui Liu

Named entity recognition (NER) for cyber security aims to identify and classify cyber security terms in large volumes of heterogeneous, multi-source cyber security text. Deep neural networks learn text features automatically from large datasets, but this data-driven approach usually handles rare entities poorly. Gasmi et al. proposed a deep learning method for NER in the cyber security domain that reached an F1 value of 82.8%, but it still struggles to accurately identify rare entities and complex words. To address this, this paper proposes a model that combines data-driven deep learning with a knowledge-driven dictionary method, building dictionary features to assist rare-entity recognition. In addition, an attention mechanism is added to the deep learning model to enrich the local features of the text, model the context better, and improve the recognition of complex entities. Experimental results show that the method outperforms the baseline model and identifies cyber security entities more effectively, with Precision, Recall and F1 reaching 90.19%, 86.60% and 88.36%, respectively.
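As a quick consistency check on the reported figures, F1 is the harmonic mean of precision and recall. The short Python sketch below recomputes it from the two values in the abstract; the function name `f1_score` is chosen here for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percentages to fractions.
reported_precision = 0.9019
reported_recall = 0.8660

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 88.36%"
```

The recomputed value agrees with the 88.36% stated in the abstract.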




Updated: 2021-05-03