Zero-shot fine-grained entity typing in information security based on ontology
Knowledge-Based Systems (IF 7.2), Pub Date: 2021-09-15, DOI: 10.1016/j.knosys.2021.107472
Han Zhang 1, Jiaxian Zhu 1, Jicheng Chen 2, Junxiu Liu 3, Lixia Ji 1

The field of information security suffers from a lack of labelled entities. To address this issue, this study proposes a zero-shot hybrid approach to fine-grained entity typing based on the Unified Cybersecurity Ontology (UCO), combining a clustering algorithm with a method for representing category labels. However, certain category labels in UCO lack distinct domain features, and certain abbreviations have no word embedding that can be obtained directly from Word2vec. We therefore propose a new method, referred to as mixed entities and hierarchy of UCO (MEHC), to represent the category labels. Moreover, to further improve the performance of fine-grained entity typing, we propose the triClustering algorithm, which re-clusters coarse-grained classification results or determines the corresponding types for new entities, based on the theorem that the sum of any two sides of a triangle is greater than the third (the triangle inequality). The experimental results show that the triClustering algorithm effectively shortens computation time and that the proposed hybrid method outperforms the other baselines for information security applications.
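The abstract does not spell out how triClustering uses the triangle theorem, so the following is only a generic sketch of how that theorem can save distance computations when assigning an embedding to its nearest cluster centre. It is not the authors' algorithm. The function names `dist` and `assign_nearest`, the use of Euclidean distance, and the pruning rule (skip centre c_j whenever d(c_best, c_j) >= 2 * d(p, c_best)) are all assumptions made for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_nearest(points, centroids):
    """Assign each point to its nearest centroid, pruning with the triangle inequality.

    Hypothetical helper, not taken from the paper.
    Returns (labels, number_of_point_to_centroid_distances_computed).
    """
    k = len(centroids)
    # Centroid-to-centroid distances are computed once and reused for every point.
    cc = [[dist(centroids[i], centroids[j]) for j in range(k)] for i in range(k)]

    labels = []
    computed = 0
    for p in points:
        best = 0
        best_d = dist(p, centroids[0])
        computed += 1
        for j in range(1, k):
            # Triangle inequality: d(c_best, c_j) <= d(p, c_best) + d(p, c_j).
            # If d(c_best, c_j) >= 2 * d(p, c_best), then d(p, c_j) >= d(p, c_best),
            # so c_j cannot be strictly closer and its distance need not be computed.
            if cc[best][j] >= 2 * best_d:
                continue
            d = dist(p, centroids[j])
            computed += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, computed
```

Under these assumptions the result is identical to an exhaustive nearest-centroid search; only the number of point-to-centroid distance evaluations changes, which is the kind of saving the abstract attributes to the triangle theorem.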




Updated: 2021-09-15