Entity set expansion in knowledge graph: a heterogeneous information network perspective
Frontiers of Computer Science (IF 3.4), Pub Date: 2020-09-29, DOI: 10.1007/s11704-020-9240-8
Chuan Shi, Jiayu Ding, Xiaohuan Cao, Linmei Hu, Bin Wu, Xiaoli Li

Entity set expansion (ESE) aims to expand an entity seed set to obtain more entities which have common properties. ESE is important for many applications such as dictionary construction and query suggestion. Traditional ESE methods relied heavily on the text and Web information of entities. Recently, some ESE methods employed knowledge graphs (KGs) to extend entities. However, they failed to effectively and efficiently utilize the rich semantics contained in a KG and ignored the text information of entities in Wikipedia. In this paper, we model a KG as a heterogeneous information network (HIN) containing multiple types of objects and relations. Fine-grained multi-type meta paths are proposed to capture the hidden relation among seed entities in a KG and thus to retrieve candidate entities. Then we rank the entities according to the meta path based structural similarity. Furthermore, to utilize the text description of entities in Wikipedia, we propose an extended model CoMeSE++ which combines both structural information revealed by a KG and text information in Wikipedia for ESE. Extensive experiments on real-world datasets demonstrate that our model achieves better performance by combining structural and textual information of entities.
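
To make the meta path idea in the abstract concrete, here is a minimal sketch of meta path based expansion on a toy graph. It is not the authors' algorithm: the triples, the function names (`build_index`, `follow`, `expand`), and the scoring rule (weight each path by the fraction of seed pairs it connects, then score a candidate by the weighted fraction of seeds that reach it) are all assumptions chosen for illustration. The paper's fine-grained multi-type meta paths and the text component of CoMeSE++ are not reproduced here.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples; all data is made up.
TRIPLES = [
    ("Inception", "directed_by", "Nolan"),
    ("Interstellar", "directed_by", "Nolan"),
    ("Dunkirk", "directed_by", "Nolan"),
    ("Titanic", "directed_by", "Cameron"),
    ("Inception", "stars", "DiCaprio"),
    ("Titanic", "stars", "DiCaprio"),
]


def build_index(triples):
    """Index neighbours per (node, relation), adding an inverse relation for each edge."""
    index = defaultdict(set)
    for head, rel, tail in triples:
        index[(head, rel)].add(tail)
        index[(tail, rel + "^-1")].add(head)
    return index


def follow(index, start, meta_path):
    """Return the set of nodes reached from `start` along a sequence of relations."""
    frontier = {start}
    for rel in meta_path:
        frontier = {n for node in frontier for n in index.get((node, rel), ())}
    return frontier


def expand(index, seeds, meta_paths):
    """Rank non-seed entities by a hypothetical meta path score (illustrative only)."""
    if len(seeds) < 2:
        raise ValueError("need at least two seeds to weight meta paths")
    seed_set = set(seeds)
    pairs = [(a, b) for a in seeds for b in seeds if a != b]
    scores = defaultdict(float)
    for path in meta_paths:
        reach = {s: follow(index, s, path) for s in seeds}
        # Assumed weighting: fraction of ordered seed pairs connected by this path.
        weight = sum(b in reach[a] for a, b in pairs) / len(pairs)
        if weight == 0:
            continue  # the path does not relate any seeds, so ignore it
        for s in seeds:
            for cand in reach[s] - seed_set:
                scores[cand] += weight / len(seeds)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    index = build_index(TRIPLES)
    paths = [("directed_by", "directed_by^-1"), ("stars", "stars^-1")]
    print(expand(index, ["Inception", "Interstellar"], paths))
```

With the seeds `Inception` and `Interstellar`, only the `directed_by` / `directed_by^-1` path connects the two seeds, so it alone contributes candidates; the `stars` path is discarded because it links no seed pair. A real system would mine the meta paths automatically from the seeds rather than take them as input.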




Updated: 2020-09-29