Precise temporal slot filling via truth finding with data-driven commonsense, Knowledge and Information Systems



Precise temporal slot filling via truth finding with data-driven commonsense
Knowledge and Information Systems (IF 2.5), Pub Date: 2020-07-16, DOI: 10.1007/s10115-020-01493-w
Xueying Wang, Meng Jiang

The task of temporal slot filling (TSF) is to extract from text data the values of specific attributes for a given entity, called “facts”, together with the temporal tags of those facts. Whereas existing work represents a temporal tag as a single time slot, in this paper we introduce and study the task of Precise TSF (PTSF), which fills two precise temporal slots: the beginning and ending time points of a fact. In our observation of a news corpus, most facts should have both points, yet fewer than 0.1% of them have time expressions in the documents. The documents’ post time, though often available, is a less precise indicator than a time expression of when a fact was valid. Therefore, neither directly decomposing the time expressions nor using an arbitrary post-time period can provide accurate results for PTSF. The challenge of PTSF lies in finding precise time tags in noisy and incomplete temporal contexts in the text. To address this challenge, we propose an unsupervised approach based on the philosophy of truth finding. The approach has two modules that mutually enhance each other: one is a reliability estimator of fact extractors conditioned on the temporal contexts; the other is a fact trustworthiness estimator based on the extractors’ reliability. Commonsense knowledge (e.g., a country has only one president at a specific time) is automatically generated from data and used to infer false claims from trustworthy facts. For evaluation, we manually collect hundreds of temporal facts from Wikipedia as ground truth, including countries’ presidential terms and sports teams’ player career histories. Experiments on a large news dataset demonstrate the accuracy and efficiency of our proposed algorithm.
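
The mutual reinforcement between extractor reliability and fact trustworthiness described in the abstract can be illustrated with a generic truth-finding loop. The sketch below is not the authors' model: the update rules (summed reliability for facts, mean trust for extractors), the uniform prior of 0.5, the function name `truth_finding`, and the toy claims are all assumptions made for illustration. It also ignores temporal contexts by assuming every claim comes from the same time window, and it encodes the "one president per country at a time" constraint simply by making competing values for one entity share a total trust of 1.

```python
from collections import defaultdict


def truth_finding(claims, iterations=20):
    """Generic truth-finding sketch (hypothetical, not the paper's algorithm).

    claims: iterable of (extractor, entity, value) triples, all assumed to
    fall in the same time window.
    Returns (trust, reliability) dictionaries.
    """
    claims = set(claims)
    # Assumed uniform prior: every extractor starts equally reliable.
    reliability = {ext: 0.5 for ext, _, _ in claims}
    trust = {}
    for _ in range(iterations):
        # Fact score: sum of the reliabilities of the extractors asserting it.
        score = defaultdict(float)
        for ext, ent, val in claims:
            score[(ent, val)] += reliability[ext]
        # Commonsense constraint (one value per entity at a time):
        # competing values for the same entity share a total trust of 1.
        total = defaultdict(float)
        for (ent, _), s in score.items():
            total[ent] += s
        trust = {fact: s / total[fact[0]] for fact, s in score.items()}
        # Extractor reliability: mean trust of the facts it claimed.
        sums, counts = defaultdict(float), defaultdict(int)
        for ext, ent, val in claims:
            sums[ext] += trust[(ent, val)]
            counts[ext] += 1
        reliability = {ext: sums[ext] / counts[ext] for ext in sums}
    return trust, reliability


# Made-up toy claims: pattern_c disagrees with the other two extractors.
toy_claims = [
    ("pattern_a", "country_x", "person_1"),
    ("pattern_b", "country_x", "person_1"),
    ("pattern_c", "country_x", "person_2"),
    ("pattern_a", "country_y", "person_3"),
    ("pattern_b", "country_y", "person_3"),
    ("pattern_c", "country_y", "person_4"),
]

if __name__ == "__main__":
    trust, reliability = truth_finding(toy_claims)
    for fact, t in sorted(trust.items()):
        print(fact, round(t, 4))
    for ext, r in sorted(reliability.items()):
        print(ext, round(r, 4))
```

On this toy input the extractor that always contradicts the majority loses reliability at each iteration, which in turn lowers the trust of the facts only it supports. A faithful PTSF implementation would additionally condition reliability on temporal context and infer beginning and ending time points, neither of which this sketch attempts.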


Updated: 2020-07-16
