Natural language question answering over knowledge graph: the marriage of SPARQL query and keyword search

  • Regular Paper
Knowledge and Information Systems

Abstract

Natural language question answering over knowledge graphs has received widespread attention. However, existing methods concentrate on improving each phase of the question answering pipeline and neglect a remaining defect: not every query intention can be identified and mapped to the correct SPARQL statement. In contrast, keyword search relies on the links among multiple keywords, regardless of the exact logical relations in the question. We therefore propose a framework (abbreviated NLQSK, after the title of this paper) that introduces keyword search into natural language question answering to compensate for this defect. First, we translate a natural language question into top-k SPARQL statements using existing methods. Second, we transform the valuable information that could not be identified and mapped into keywords, and then retrieve the neighboring information in the knowledge graph through a keyword index. Third, we combine the SPARQL block (i.e., a SPARQL statement and its result) with keyword search to produce the answer to the natural language question. Finally, experiments on the benchmark dataset confirm that keyword search can compensate for this defect of natural language question answering and that NLQSK answers more questions than existing state-of-the-art question answering systems.
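
The three steps can be illustrated with a minimal sketch. It is not the authors' implementation: the toy triples, the single-pattern matcher standing in for a SPARQL engine, the function names, and the intersection rule used to combine structured results with keyword neighbors are all assumptions made for illustration only.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples; all data is made up.
TRIPLES = [
    ("Alice", "directed", "FilmA"),
    ("Alice", "directed", "FilmB"),
    ("FilmA", "genre", "comedy"),
    ("FilmB", "genre", "drama"),
]


def match_pattern(pattern, triples):
    """Stand-in for one SPARQL triple pattern; a term starting with '?' is the variable."""
    results = set()
    for triple in triples:
        binding = None
        matched = True
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                binding = have
            elif want != have:
                matched = False
                break
        if matched and binding is not None:
            results.add(binding)
    return results


def build_keyword_index(triples):
    """Map each lowercased node label to the set of nodes adjacent to it."""
    index = defaultdict(set)
    for s, _p, o in triples:
        index[s.lower()].add(o)
        index[o.lower()].add(s)
    return index


def answer(pattern, leftover_keywords, triples):
    """Keep structured results that also neighbor every leftover keyword
    (a hypothetical combination rule, chosen for simplicity)."""
    candidates = match_pattern(pattern, triples)
    index = build_keyword_index(triples)
    for keyword in leftover_keywords:
        candidates &= index.get(keyword.lower(), set())
    return candidates


# "Which comedy did Alice direct?" where "comedy" could not be mapped to SPARQL.
films = answer(("Alice", "directed", "?film"), ["comedy"], TRIPLES)
```

In this toy setting the structured pattern alone returns both films, and the leftover keyword narrows the result to the one adjacent to "comedy" in the graph.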

Acknowledgements

This work was supported by the Youth Project of the Science and Technology Research Program of the Chongqing Education Commission of China (No. KJQN201901414 and No. KJQN201901408), the Startup Foundation for Introducing Talent of Yangtze Normal University (No. 0107/011160052), the PhD Candidate Talent Development Project (No. BYJS201908), the Project of the Chongqing Natural Science Foundation (No. cstc2019jcyj-msxmX0683 and No. cstc2019jcyj-msxm1579), the Major Project of the Science and Technology Research Program of the Chongqing Education Commission of China (No. KJZD-M201901401), the National Natural Science Foundation of China (Grant No. 61672102 and No. 61802244), the Program for New Century Excellent Talents in University of the Ministry of Education of China (Grant No. NCET-10-0239), and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2019JQ-668).

Author information

Corresponding author

Correspondence to Jiangli Duan.

Cite this article

Hu, X., Duan, J. & Dang, D. Natural language question answering over knowledge graph: the marriage of SPARQL query and keyword search. Knowl Inf Syst 63, 819–844 (2021). https://doi.org/10.1007/s10115-020-01534-4

  • DOI: https://doi.org/10.1007/s10115-020-01534-4
