A review on cyber security named entity recognition, Frontiers of Information Technology & Electronic Engineering


A review on cyber security named entity recognition
Frontiers of Information Technology & Electronic Engineering ( IF 2.7 ) Pub Date : 2021-09-16 , DOI: 10.1631/fitee.2000286
Chen Gao¹, Xuan Zhang¹,²,³, Mengting Han¹, Hui Liu¹


With the rapid development of Internet technology and the advent of the era of big data, a growing volume of cyber security text is available on the Internet. These texts include not only security concepts, incidents, tools, guidelines, and policies, but also risk management approaches, best practices, assurances, technologies, and more. By integrating large-scale, heterogeneous, unstructured cyber security information, the identification and classification of cyber security entities can help handle cyber security issues. Because texts in this domain are complex and diverse, security entities are difficult to identify with traditional named entity recognition (NER) methods. This paper describes various approaches and techniques for NER in this domain, including rule-based, dictionary-based, and machine-learning-based approaches, and discusses the problems faced by NER research in this domain, such as conjunction and disjunction, non-standardized naming conventions, abbreviations, and massive nesting. Three future directions for NER in cyber security are proposed: (1) application of unsupervised or semi-supervised techniques; (2) development of a more comprehensive cyber security ontology; (3) development of a more comprehensive deep learning model.
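To make the first two families of approaches concrete, the following is a minimal Python sketch that combines a rule-based recognizer (regular expressions for identifiers with a fixed surface form) with a dictionary-based one (lookup in a small gazetteer). It is not taken from the paper: the label names, the patterns, the gazetteer entries, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical gazetteer (dictionary-based approach): known entity names
# mapped to illustrative labels. A real system would load a curated list.
GAZETTEER = {
    "wannacry": "MALWARE",
    "mimikatz": "TOOL",
    "openssl": "SOFTWARE",
}

# Hypothetical rules (rule-based approach): regular expressions for
# entities whose surface form follows a fixed pattern.
RULES = [
    ("VULN_ID", re.compile(r"\bCVE-\d{4}-\d{4,}\b")),
    ("IPV4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def rule_based(text):
    """Return (start, end, label, surface) for every rule match."""
    return [
        (m.start(), m.end(), label, m.group())
        for label, pattern in RULES
        for m in pattern.finditer(text)
    ]


def dictionary_based(text):
    """Return (start, end, label, surface) for tokens found in the gazetteer."""
    spans = []
    for m in re.finditer(r"[A-Za-z0-9_-]+", text):
        label = GAZETTEER.get(m.group().lower())  # case-insensitive lookup
        if label is not None:
            spans.append((m.start(), m.end(), label, m.group()))
    return spans


def recognize(text):
    """Merge both recognizers and order the spans by position."""
    return sorted(rule_based(text) + dictionary_based(text))


if __name__ == "__main__":
    sample = "WannaCry exploited CVE-2017-0144 from 10.0.0.5 using OpenSSL."
    for start, end, label, surface in recognize(sample):
        print(f"{start:>3}-{end:<3} {label:<9} {surface}")
```

A sketch like this also hints at the difficulties the abstract lists: a single-token dictionary lookup misses abbreviations and non-standardized names, and flat spans cannot represent nested entities, which is part of why machine-learning-based approaches are considered.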


Updated: 2021-09-17
