RetrievalSum: A Retrieval Enhanced Framework for Abstractive Summarization

arXiv - CS - Computation and Language. Pub Date: 2021-09-16, DOI: arxiv-2109.07943
Chenxin An, Ming Zhong, Zhichao Geng, Jianqiang Yang, Xipeng Qiu

Existing summarization systems mostly generate summaries by relying purely on the content of the source document. However, even humans usually need references or exemplars to fully understand a source document and to write summaries in a particular format. How to find high-quality exemplars and incorporate them into summarization systems remains challenging and worth exploring. In this paper, we propose RetrievalSum, a novel retrieval-enhanced abstractive summarization framework consisting of a dense Retriever and a Summarizer. First, several closely related exemplars are retrieved as supplementary input to help the generation model understand the text more comprehensively. The retrieved exemplars also guide the model toward the writing style of a specific corpus. We validate our method on a wide range of summarization datasets across multiple domains and with two backbone models, BERT and BART. Results show that our framework improves ROUGE-1 by 1.38 to 4.66 points compared with the powerful pre-trained models and achieves a new state of the art on BillSum. Human evaluation demonstrates that our retrieval-enhanced model better captures domain-specific writing style.
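
The retrieve-then-summarize flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-count `embed` function is a toy stand-in for a learned dense encoder, and the names `retrieve`, `build_input`, and the `[SEP]` separator are assumptions made for illustration only. The sketch ranks a pool of exemplars by cosine similarity to the source document and prepends the document to the top-k exemplars to form the input a summarizer would receive.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder (assumption): a word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[tok] * b[tok] for tok in a if tok in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(document, exemplar_pool, k=2):
    """Return the k exemplars most similar to the document (hypothetical helper)."""
    query = embed(document)
    ranked = sorted(
        exemplar_pool,
        key=lambda exemplar: cosine(query, embed(exemplar)),
        reverse=True,
    )
    return ranked[:k]


def build_input(document, exemplars, sep=" [SEP] "):
    """Concatenate the document with retrieved exemplars (separator is an assumption)."""
    return sep.join([document] + list(exemplars))


if __name__ == "__main__":
    pool = [
        "a bill to amend the tax code",
        "the weather is sunny today",
        "a bill to fund public schools",
    ]
    doc = "a bill to amend the school tax"
    exemplars = retrieve(doc, pool, k=2)
    # The string below is what would be passed to the summarization model.
    print(build_input(doc, exemplars))
```

In a real system the toy `embed` would be replaced by a trained encoder and the concatenated string would be fed to a backbone such as BERT or BART; how the paper actually encodes and fuses the exemplars is not specified in this abstract.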


Updated: 2021-09-17
