Efficient SPARQL Autocompletion via SPARQL


arXiv - CS - Databases. Pub Date: 2021-04-29. DOI: arxiv-2104.14595
Hannah Bast, Johannes Kalmbach, Theresa Klumpp, Florian Kramer, Niklas Schnelle

We show how to achieve fast autocompletion for SPARQL queries on very large knowledge bases. At any position in the body of a SPARQL query, the autocompletion suggests matching subjects, predicates, or objects. The suggestions are context-sensitive in the sense that they lead to a non-empty result and are ranked by their relevance to the part of the query already typed. The suggestions can be narrowed down by prefix search on the names and aliases of the desired subject, predicate, or object. All suggestions are themselves obtained via SPARQL queries, which we call autocompletion queries. For existing SPARQL engines, these queries are impractically slow on large knowledge bases. We present various algorithmic and engineering improvements to an existing SPARQL engine such that these autocompletion queries are executed efficiently. We provide an extensive evaluation of a variety of suggestion methods on three large knowledge bases, including Wikidata (6.9B triples). We explore the trade-off between the relevance of the suggestions and the processing time of the autocompletion queries. We compare our results with two widely used SPARQL engines, Virtuoso and Blazegraph. On Wikidata, we achieve fully sensitive suggestions with sub-second response times for over 90% of a large and diverse set of thousands of autocompletion queries. Materials for full reproducibility, an interactive evaluation web app, and a demo are available at https://ad.informatik.uni-freiburg.de/publications.
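To make the notion of an autocompletion query concrete, here is a minimal Python sketch that assembles one from the part of the query already typed. It is an illustration only, not the query templates used by the authors: the function name `build_autocompletion_query`, the `?candidate` variable, ranking by match count, and matching names through `rdfs:label` are all assumptions chosen for this example.

```python
def build_autocompletion_query(context_patterns, incomplete_pattern,
                               prefix="", limit=20):
    """Build a SPARQL query that suggests values for ?candidate.

    context_patterns:   triple patterns already typed (strings, no trailing dot)
    incomplete_pattern: the pattern being typed, with ?candidate at the
                        position to be completed
    prefix:             text typed so far for the name of the desired entity
    All names here are hypothetical; this is a sketch, not the paper's method.
    """
    lines = [
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT ?candidate (COUNT(*) AS ?count) WHERE {",
    ]
    # Context-sensitivity: a candidate only matches together with the
    # patterns already typed, so picking it cannot yield an empty result.
    for pattern in context_patterns:
        lines.append(f"  {pattern} .")
    lines.append(f"  {incomplete_pattern} .")
    if prefix:
        # Prefix search on names (assumed to be stored as rdfs:label).
        escaped = prefix.lower().replace("\\", "\\\\").replace('"', '\\"')
        lines.append("  ?candidate rdfs:label ?name .")
        lines.append(f'  FILTER(STRSTARTS(LCASE(STR(?name)), "{escaped}"))')
    lines.append("}")
    lines.append("GROUP BY ?candidate")
    # Assumed relevance measure: how often the candidate occurs in context.
    lines.append("ORDER BY DESC(?count)")
    lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


# Suggest predicates for entities that are instances of "human" on Wikidata.
query = build_autocompletion_query(
    ["?person <http://www.wikidata.org/prop/direct/P31> "
     "<http://www.wikidata.org/entity/Q5>"],
    "?person ?candidate ?value",
    prefix="date",
)
print(query)
```

In this sketch a candidate with several matching labels is counted once per label, so the counts are only a rough ranking signal. On a large knowledge base, a query of this shape joins and groups over a very large intermediate result, which is the kind of cost the abstract refers to when it calls such queries impractically slow on existing engines.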


Updated: 2021-05-03
