Querying knowledge graphs in natural language
Journal of Big Data ( IF 8.1 ) Pub Date : 2021-01-06 , DOI: 10.1186/s40537-020-00383-w
Shiqi Liang , Kurt Stockinger , Tarcisio Mendes de Farias , Maria Anisimova , Manuel Gil

Knowledge graphs are a powerful concept for querying large amounts of data. These knowledge graphs are typically enormous and are often not easily accessible to end-users because they require specialized knowledge of query languages such as SPARQL. Moreover, end-users need a deep understanding of the structure of the underlying data models, which are often based on the Resource Description Framework (RDF). This drawback has led to the development of Question-Answering (QA) systems that enable end-users to express their information needs in natural language. While existing systems simplify user access, there is still room for improvement in the accuracy of these systems. In this paper we propose a new QA system for translating natural language questions into SPARQL queries. The key idea is to break up the translation process into five smaller, more manageable sub-tasks and to use ensemble machine learning methods as well as Tree-LSTM-based neural network models to automatically learn and translate a natural language question into a SPARQL query. The performance of our proposed QA system is empirically evaluated using two renowned benchmarks: the 7th Question Answering over Linked Data Challenge (QALD-7) and the Large-Scale Complex Question Answering Dataset (LC-QuAD). Experimental results show that our QA system outperforms the state-of-the-art systems by 15% on the QALD-7 dataset and by 48% on the LC-QuAD dataset. In addition, we make our source code available.
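To make the translation target concrete, the following minimal Python sketch shows what a natural language question might look like once it has been turned into a SPARQL query. It is not the authors' pipeline: the abstract does not name the five sub-tasks, so the sketch simply assumes that an entity and a relation have already been linked to URIs and fills a single-triple template. The names `ParsedQuestion` and `build_sparql`, the example question, and the DBpedia-style URIs are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedQuestion:
    """Hypothetical intermediate result: one linked entity and one linked relation."""
    entity_uri: str
    relation_uri: str


def build_sparql(parsed: ParsedQuestion) -> str:
    """Fill a single-triple SELECT template with the linked URIs."""
    return (
        "SELECT DISTINCT ?answer WHERE { "
        f"<{parsed.entity_uri}> <{parsed.relation_uri}> ?answer . "
        "}"
    )


# Assumed example question: "Who is the author of Dune?"
# The URIs below are illustrative, written in the DBpedia naming style.
parsed = ParsedQuestion(
    entity_uri="http://dbpedia.org/resource/Dune_(novel)",
    relation_uri="http://dbpedia.org/ontology/author",
)
query = build_sparql(parsed)
print(query)
```

The hard part that the paper addresses is everything before `build_sparql`: deciding which entity, which relation, and which query shape a question calls for. A fixed template like this one covers only the simplest single-fact questions.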




Updated: 2021-01-07