FAR-ASS: Fact-aware reinforced abstractive sentence summarization
Information Processing & Management (IF 8.6), Pub Date: 2021-01-19, DOI: 10.1016/j.ipm.2020.102478
Mengli Zhang, Gang Zhou, Wanting Yu, Wenfen Liu

Automatic summarization systems provide an effective solution to today's unprecedented growth of textual data. For real-world tasks such as data mining and information retrieval, the factual correctness of the generated summary is critical. However, existing models usually focus on improving informativeness rather than optimizing factual correctness. In this work, we present a Fact-Aware Reinforced Abstractive Sentence Summarization framework, denoted FAR-ASS, to improve the factual correctness of neural abstractive summarization models. Specifically, we develop an automatic fact extraction scheme that leverages OpenIE (Open Information Extraction) and dependency parsing tools to extract structured fact tuples. Then, to evaluate factual correctness quantitatively, we define a factual correctness score function that considers both factual accuracy and factual redundancy. We further propose adopting reinforcement learning to improve readability and factual correctness by jointly optimizing a mixed-objective learning function. We use the English Gigaword and DUC 2004 datasets to evaluate our model. Experimental results show that, compared with competitive models, our model significantly improves the factual correctness and readability of generated summaries, and reduces duplication while also improving informativeness.
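
The abstract does not give the exact form of the score function or the mixed objective, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes fact tuples have already been extracted as (subject, relation, object) triples, treats accuracy as the share of distinct summary facts found in the source, treats redundancy as the share of repeated fact mentions, and combines two losses with a fixed weight. The names `factual_score`, `mixed_loss`, `redundancy_weight`, and `gamma` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# A fact tuple as (subject, relation, object); the extraction step
# (OpenIE plus dependency parsing) is assumed to have run already.
Fact = Tuple[str, str, str]


def factual_score(source_facts: List[Fact],
                  summary_facts: List[Fact],
                  redundancy_weight: float = 0.5) -> float:
    """Hypothetical score: accuracy minus weighted redundancy.

    This formula is an assumption made for illustration; the paper's
    actual definition is not stated in the abstract.
    """
    if not summary_facts:
        return 0.0
    source_set = set(source_facts)
    counts = Counter(summary_facts)
    # Accuracy: fraction of distinct summary facts supported by the source.
    accuracy = sum(1 for fact in counts if fact in source_set) / len(counts)
    # Redundancy: fraction of summary fact mentions that are repeats.
    redundancy = (len(summary_facts) - len(counts)) / len(summary_facts)
    return accuracy - redundancy_weight * redundancy


def mixed_loss(ml_loss: float, rl_loss: float, gamma: float = 0.9) -> float:
    """Assumed mixed objective: weighted sum of an RL loss and an ML loss."""
    return gamma * rl_loss + (1.0 - gamma) * ml_loss


source = [("police", "arrested", "suspect"), ("suspect", "fled", "scene")]
summary = [("police", "arrested", "suspect"),
           ("police", "arrested", "suspect"),   # repeated fact
           ("police", "released", "suspect")]   # unsupported fact
score = factual_score(source, summary)
loss = mixed_loss(ml_loss=2.0, rl_loss=4.0, gamma=0.5)
```

In a reinforcement learning setup, a score like this would typically serve as part of the reward for a sampled summary, while the maximum-likelihood term keeps the output readable; how FAR-ASS weights the two is not stated here.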




Updated: 2021-01-19