Precise temporal slot filling via truth finding with data-driven commonsense

  • Regular Paper
  • Knowledge and Information Systems

Abstract

The task of temporal slot filling (TSF) is to extract, from text data, values of specific attributes for a given entity (called “facts”) together with the temporal tags of those facts. Whereas existing work denotes a temporal tag as a single time slot, in this paper we introduce and study the task of Precise TSF (PTSF), which fills two precise temporal slots: the beginning and ending time points. Based on our observation of a news corpus, most facts should have both points; however, fewer than 0.1% of them have time expressions in the documents. On the other hand, the documents’ post time, though often available, is not as precise as a time expression in indicating when a fact was valid. Therefore, neither directly decomposing the time expressions nor using an arbitrary post-time period can provide accurate results for PTSF. The challenge of PTSF lies in finding precise time tags in noisy and incomplete temporal contexts in the text. To address this challenge, we propose an unsupervised approach based on the philosophy of truth finding. The approach has two modules that mutually enhance each other: one is a reliability estimator of fact extractors conditional on the temporal contexts; the other is a fact trustworthiness estimator based on the extractors’ reliability. Commonsense knowledge (e.g., a country has only one president at a specific time) is automatically generated from data and used to infer false claims from trustworthy facts. For evaluation, we manually collect hundreds of temporal facts from Wikipedia as ground truth, including countries’ presidential terms and sports teams’ player career histories. Experiments on a large news dataset demonstrate the accuracy and efficiency of our proposed algorithm.
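
The mutual reinforcement between extractor reliability and fact trustworthiness described above can be illustrated with a generic truth-finding iteration. The sketch below is not the algorithm of the paper: it ignores temporal contexts when estimating reliability, and it hard-codes the "one value per entity per year" constraint rather than generating it from data. The function name `truth_finding`, the extractor names, and the toy claims are all assumptions made for illustration.

```python
from collections import defaultdict


def truth_finding(claims, n_iter=20):
    """Generic truth-finding loop (illustrative sketch only).

    claims: iterable of (extractor, entity, value, year) tuples.
    Returns (trust, reliability) dictionaries.
    """
    # Map each candidate fact to the set of extractors that claim it.
    supporters = defaultdict(set)
    for extractor, entity, value, year in claims:
        supporters[(entity, value, year)].add(extractor)

    # Start with an uninformative reliability for every extractor.
    reliability = {ext: 0.5 for exts in supporters.values() for ext in exts}
    trust = {}

    for _ in range(n_iter):
        # Step 1: score each fact by the reliability of its supporters.
        score = {fact: sum(reliability[e] for e in exts)
                 for fact, exts in supporters.items()}

        # Hard-coded constraint (an assumption of this sketch): facts that
        # share an (entity, year) slot compete, so normalise within the slot.
        slot_total = defaultdict(float)
        for (entity, _value, year), s in score.items():
            slot_total[(entity, year)] += s
        trust = {fact: s / slot_total[(fact[0], fact[2])]
                 for fact, s in score.items()}

        # Step 2: an extractor is as reliable as its facts are trustworthy.
        claimed = defaultdict(list)
        for fact, exts in supporters.items():
            for ext in exts:
                claimed[ext].append(trust[fact])
        reliability = {ext: sum(vals) / len(vals)
                       for ext, vals in claimed.items()}

    return trust, reliability


# Toy claims from three hypothetical extractors.
claims = [
    ("pattern_a", "US", "Obama", 2010),
    ("pattern_b", "US", "Obama", 2010),
    ("pattern_c", "US", "Bush", 2010),      # conflicting claim
    ("pattern_a", "France", "Sarkozy", 2010),
    ("pattern_b", "France", "Sarkozy", 2010),
    ("pattern_c", "France", "Sarkozy", 2010),
]
trust, reliability = truth_finding(claims)
```

In this toy run the conflicting claim ends up with lower trust than its competitor, and the extractor that produced it ends up with lower reliability than the other two. Conditioning reliability on temporal context, as the abstract describes, would require keeping a separate reliability value per extractor and context rather than one value per extractor.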

Notes

  1. If a president serves multiple terms of office, we expect multiple tuples with the same country and the same president name but different valid time periods.
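
As a small illustration of the note above, the following sketch stores one tuple per term of office. The `TemporalFact` type, its field names, the year-level granularity, and the `holders_at` helper are assumptions made for this example and are not taken from the paper.

```python
from typing import List, NamedTuple


class TemporalFact(NamedTuple):
    """One fact with its beginning and ending time points (years here)."""
    entity: str
    attribute: str
    value: str
    begin: int
    end: int


# Two non-consecutive terms of the same president give two tuples.
facts = [
    TemporalFact("United States", "president", "Grover Cleveland", 1885, 1889),
    TemporalFact("United States", "president", "Grover Cleveland", 1893, 1897),
]


def holders_at(facts: List[TemporalFact], entity: str,
               attribute: str, year: int) -> List[str]:
    """Return the values whose valid period covers the given year."""
    return [f.value for f in facts
            if f.entity == entity
            and f.attribute == attribute
            and f.begin <= year <= f.end]
```

Keeping each term as its own tuple means a query for a year between the two terms returns nothing from this list, which a single merged period from 1885 to 1897 would get wrong.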

Acknowledgements

Our research was supported by the National Science Foundation under grant IIS-1849816.

Author information

Corresponding author

Correspondence to Meng Jiang.

Cite this article

Wang, X., Jiang, M. Precise temporal slot filling via truth finding with data-driven commonsense. Knowl Inf Syst 62, 4113–4139 (2020). https://doi.org/10.1007/s10115-020-01493-w
