CAFE: Knowledge graph completion using neighborhood-aware features
Engineering Applications of Artificial Intelligence (IF 7.5), Pub Date: 2021-05-18, DOI: 10.1016/j.engappai.2021.104302
Agustín Borrego, Daniel Ayala, Inma Hernández, Carlos R. Rivero, David Ruiz

Knowledge Graphs (KGs) currently contain a vast amount of structured information in the form of entities and relations. Because KGs are often constructed automatically by means of information extraction processes, they may miss information that was either not present in the original source or not successfully extracted. As a result, KGs might lack useful and valuable information. Current approaches that aim to complete missing information in KGs have two main drawbacks. First, some depend on embedded representations, which impose a very expensive preprocessing step and need to be recomputed as the KG grows. Second, others are based on long random paths that may not cover all relevant information, whereas exhaustively analyzing all possible paths between entities is very time-consuming. In this paper, we present an approach to complete KGs based on evaluating candidate triples using a set of neighborhood-based features. Our approach exploits the highly connected nature of KGs by analyzing the entities and relations surrounding any given pair of entities, while avoiding full recomputations as new entities are added. Our results indicate that our proposal identifies correct triples more effectively than other state-of-the-art approaches, achieving higher average F1 scores in all tested datasets. Therefore, we conclude that the information present in the vicinities of the two entities within a candidate triple can be leveraged to determine whether that triple is missing from the KG or not.
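
To make the idea of neighborhood-based features more concrete, the following is a minimal Python sketch. It is not the CAFE feature set, which the abstract does not enumerate. The function names (`build_index`, `neighbors`, `neighborhood_features`), the four illustrative features, and the toy triples are all assumptions introduced here for illustration only.

```python
from collections import defaultdict


def build_index(triples):
    """Index a KG given as (head, relation, tail) triples by entity."""
    out_edges = defaultdict(set)  # entity -> {(relation, tail)}
    in_edges = defaultdict(set)   # entity -> {(relation, head)}
    for h, r, t in triples:
        out_edges[h].add((r, t))
        in_edges[t].add((r, h))
    return out_edges, in_edges


def neighbors(entity, out_edges, in_edges):
    """Entities one hop away from `entity`, ignoring edge direction."""
    outgoing = {t for _, t in out_edges.get(entity, ())}
    incoming = {h for _, h in in_edges.get(entity, ())}
    return outgoing | incoming


def neighborhood_features(candidate, out_edges, in_edges):
    """Hypothetical features describing the vicinity of a candidate triple."""
    h, r, t = candidate
    nh = neighbors(h, out_edges, in_edges)
    nt = neighbors(t, out_edges, in_edges)
    shared = nh & nt
    union = nh | nt
    return {
        # How many entities are adjacent to both the head and the tail.
        "shared_neighbors": len(shared),
        # Overlap of the two one-hop neighborhoods (Jaccard index).
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Does the head already appear as a subject of relation r?
        "head_has_relation": int(any(rel == r for rel, _ in out_edges.get(h, ()))),
        # Does the tail already appear as an object of relation r?
        "tail_has_relation": int(any(rel == r for rel, _ in in_edges.get(t, ()))),
    }


# Toy KG (invented data, not from the paper).
triples = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("alice", "lives_in", "springfield"),
    ("bob", "knows", "alice"),
]
out_edges, in_edges = build_index(triples)
features = neighborhood_features(("bob", "lives_in", "springfield"), out_edges, in_edges)
print(features)
```

In a sketch like this, the feature vector would be passed to a binary classifier that decides whether the candidate triple is likely missing from the KG. Adding a triple only touches the index entries of its two entities, which loosely mirrors the abstract's point about avoiding full recomputation as the KG grows.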




Updated: 2021-05-18