Data-driven assessment of structural evolution of RDF graphs, Semantic Web


Data-driven assessment of structural evolution of RDF graphs
Semantic Web (IF 3.0), Pub Date: 2020-04-17, DOI: 10.3233/sw-200368
Carlos Bobed (1, 2), Pierre Maillot (3), Peggy Cellier (4), Sébastien Ferré (3)


Since the birth of the Semantic Web, numerous knowledge bases have appeared. The applications that exploit them rely on the quality of their data over time. In this regard, one of the main dimensions of data quality is conformance to the expected usage of the vocabulary. However, vocabulary usage (i.e., how classes and properties are actually populated) can vary from one knowledge base to another. Moreover, such usage can evolve within a base over time and diverge from previous practices. Methods have been proposed to follow the evolution of a knowledge base by observing changes in its intensional schema (or ontology); however, they do not capture the evolution of the actual data, which can vary greatly in practice. In this paper, we propose a data-driven approach to assess the global evolution of vocabulary usage in large RDF graphs. Our proposal relies on two structural measures, defined at different granularities (dataset vs. update) and based on pattern mining techniques. Thorough experiments show that our approach is scalable and can capture structural evolution over time in both synthetic (LUBM) and real knowledge bases (different snapshots and updates of DBpedia).
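
To make the notion of vocabulary usage more concrete, the following is a minimal sketch and not the measures defined in the paper. It assumes triples are plain `(subject, predicate, object)` tuples, treats the set of classes and properties attached to each subject as one usage pattern, and compares two snapshots with a total variation distance. The names `usage_patterns` and `usage_distance`, the `ex:` identifiers, and the choice of distance are all illustrative assumptions.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"  # assumed shorthand for the rdf:type predicate


def usage_patterns(triples):
    """Count, over all subjects, the set of classes and properties each one uses."""
    items = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            items[s].add(("class", o))   # the subject is an instance of class o
        else:
            items[s].add(("prop", p))    # the subject uses property p
    return Counter(frozenset(v) for v in items.values())


def usage_distance(old, new):
    """Total variation distance between two pattern distributions (0 = same, 1 = disjoint)."""
    n_old, n_new = sum(old.values()), sum(new.values())
    if n_old == 0 or n_new == 0:
        return 0.0 if n_old == n_new else 1.0
    patterns = set(old) | set(new)
    return 0.5 * sum(abs(old[p] / n_old - new[p] / n_new) for p in patterns)


# Hypothetical snapshots: in the second one, ex:bob switches from ex:name to ex:label.
snapshot_v1 = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
]
snapshot_v2 = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", "ex:label", "Bob"),
]

drift = usage_distance(usage_patterns(snapshot_v1), usage_patterns(snapshot_v2))
print(f"usage drift: {drift:.2f}")
```

In this toy case the schema could stay unchanged while the data drifts, which is the kind of evolution a schema-only comparison would miss. A pattern-mining approach such as the one the abstract describes would presumably work with frequent or compressing patterns rather than exact per-subject item sets.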


Updated: 2020-04-17
