A condense-then-select strategy for text summarization
Knowledge-Based Systems (IF 8.8), Pub Date: 2021-06-18, DOI: 10.1016/j.knosys.2021.107235
Hou Pong Chan, Irwin King

Select-then-compress is a popular hybrid framework for text summarization due to its high efficiency. This framework first selects salient sentences and then independently condenses each selected sentence into a concise version. However, compressing sentences separately ignores the context information of the document and is therefore prone to deleting salient information. To address this limitation, we propose a novel condense-then-select framework for text summarization. Our framework first condenses every document sentence concurrently. The original document sentences and their compressed versions then become the candidates for extraction. Finally, an extractor uses the context information of the document to select candidates and assemble them into a summary. If salient information is deleted during condensing, the extractor can select the original sentence to retain that information. Thus, our framework helps to avoid the loss of salient information while preserving the high efficiency of sentence-level compression. Experimental results on the CNN/DailyMail, DUC-2002, and PubMed datasets demonstrate that our framework outperforms the select-then-compress framework and other strong baselines.
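The candidate-building and selection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: `toy_compress` (which just deletes parenthetical asides) and `toy_select` (a greedy word-coverage heuristic under a word budget) are hypothetical stand-ins for the learned compressor and the context-aware extractor, and every name below is an assumption made for illustration only.

```python
import re
from collections import Counter
from typing import Callable, List, Tuple

Candidate = Tuple[int, str]  # (index of the source sentence, candidate text)


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_compress(sentence: str) -> str:
    """Hypothetical stand-in for a learned compressor: drop parenthetical asides."""
    return re.sub(r"\s*\([^)]*\)", "", sentence).strip()


def build_candidates(sentences: List[str],
                     compress: Callable[[str], str]) -> List[Candidate]:
    """Condense step: each sentence is compressed independently of the others,
    and both the original and the compressed version become candidates."""
    candidates: List[Candidate] = []
    for i, sentence in enumerate(sentences):
        candidates.append((i, sentence))
        shorter = compress(sentence)
        if shorter and shorter != sentence:
            candidates.append((i, shorter))
    return candidates


def toy_select(sentences: List[str], candidates: List[Candidate],
               max_words: int) -> List[str]:
    """Select step (assumed heuristic): greedily pick the candidate that covers
    the most not-yet-covered document words, using at most one candidate per
    source sentence and staying within a word budget."""
    doc_freq = Counter(tokenize(" ".join(sentences)))  # document-level context
    covered, used_sources = set(), set()
    chosen: List[Candidate] = []
    words_left = max_words
    while True:
        best, best_gain = None, 0
        for i, text in candidates:
            tokens = tokenize(text)
            if i in used_sources or len(tokens) > words_left:
                continue
            gain = sum(doc_freq[w] for w in set(tokens) - covered)
            if gain > best_gain:
                best, best_gain = (i, text), gain
        if best is None:
            break
        chosen.append(best)
        used_sources.add(best[0])
        best_tokens = tokenize(best[1])
        covered |= set(best_tokens)
        words_left -= len(best_tokens)
    # Restore document order before assembling the summary.
    return [text for _, text in sorted(chosen)]


sentences = ["The cat sat (quietly) on the mat.", "Dogs bark."]
candidates = build_candidates(sentences, toy_compress)
tight_summary = toy_select(sentences, candidates, max_words=6)
loose_summary = toy_select(sentences, candidates, max_words=20)
```

In this toy run, a tight budget forces the selector onto the compressed candidate, while a looser budget lets it keep the original sentence, whose extra word would otherwise be lost. That mirrors the fallback the abstract describes, though the real extractor is learned rather than a coverage heuristic.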




Updated: 2021-06-23