EAGER: Embedding-Assisted Entity Resolution for Knowledge Graphs


arXiv - CS - Databases, Pub Date: 2021-01-15, DOI: arxiv-2101.06126
Daniel Obraczka, Jonathan Schuchart, Erhard Rahm

Entity Resolution (ER) is a constituent part of integrating different knowledge graphs: it identifies entities that refer to the same real-world object. A promising approach is to use graph embeddings for ER, determining the similarity of entities from the similarity of their graph neighborhoods. Similarity computation for such embeddings translates to calculating the distance between them in the embedding space, which is comparatively simple. However, previous work has shown that graph embeddings alone are not sufficient to achieve high ER quality. We therefore propose a more comprehensive ER approach for knowledge graphs called EAGER (Embedding-Assisted Knowledge Graph Entity Resolution) to flexibly utilize both the similarity of graph embeddings and attribute values within a supervised machine learning approach. We evaluate our approach on 23 benchmark datasets with differently sized and structured knowledge graphs and use hypothesis tests to ensure statistical significance of our results. Furthermore, we compare our approach with state-of-the-art ER solutions, where our approach yields competitive results for table-oriented ER problems and shallow knowledge graphs but much better results for deeper knowledge graphs.
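The general idea described in the abstract, combining an embedding-space distance with attribute-value similarities as input to a supervised classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the similarity measures or classifiers used, so the `difflib` string similarity, the hand-written logistic regression, the function names, and the toy entities below are all assumptions chosen only to keep the example self-contained.

```python
import math
from difflib import SequenceMatcher


def embedding_distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def attribute_similarity(a, b):
    """String similarity in [0, 1]; difflib is an illustrative stand-in."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pair_features(e1, e2, attributes):
    """Feature vector for an entity pair: one embedding distance
    followed by one similarity score per attribute."""
    feats = [embedding_distance(e1["embedding"], e2["embedding"])]
    for attr in attributes:
        feats.append(attribute_similarity(e1.get(attr, ""), e2.get(attr, "")))
    return feats


def train_logistic(X, y, lr=0.5, epochs=500):
    """Logistic regression fitted by stochastic gradient descent
    (hypothetical classifier choice, not taken from the paper)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - t  # gradient of the log loss with respect to z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict_match(model, x):
    """True if the classifier considers the pair a match."""
    w, b = model
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5


# Toy entities from two hypothetical knowledge graphs (invented data).
kg1 = [
    {"embedding": [0.90, 0.10], "name": "Leipzig University"},
    {"embedding": [0.10, 0.80], "name": "Erhard Rahm"},
]
kg2 = [
    {"embedding": [0.88, 0.12], "name": "Leipzig Univ."},
    {"embedding": [0.12, 0.79], "name": "E. Rahm"},
]

# Labeled pairs: 1 = same real-world object, 0 = different objects.
pairs = [(kg1[0], kg2[0], 1), (kg1[1], kg2[1], 1),
         (kg1[0], kg2[1], 0), (kg1[1], kg2[0], 0)]
X = [pair_features(a, b, ["name"]) for a, b, _ in pairs]
y = [label for _, _, label in pairs]

model = train_logistic(X, y)
predictions = [predict_match(model, x) for x in X]
print(predictions)
```

Keeping the embedding distance and the attribute similarities as separate features lets the classifier learn how much weight each signal deserves, which reflects the abstract's point that embeddings alone are not sufficient.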


Updated: 2021-01-18
