Somun: entity-centric summarization incorporating pre-trained language models
Neural Computing and Applications (IF 6) Pub Date: 2020-09-11, DOI: 10.1007/s00521-020-05319-2
Emrah Inan

Text summarization addresses the problem of capturing essential information from large volumes of text. Existing methods depend either on end-to-end models or on hand-crafted preprocessing steps. In this study, we propose an entity-centric summarization method that extracts named entities and uses a dependency parser to build a small graph over them. To extract entities, we employ well-known pre-trained language models. After constructing the graph, we perform summarization by ranking entities with the harmonic centrality algorithm. Experiments show that we outperform state-of-the-art unsupervised learning baselines, improving ROUGE-1 scores by more than 10% and ROUGE-2 scores by more than 50%. Moreover, we achieve results comparable to recent end-to-end models.
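The abstract gives no code, so the following is a minimal sketch of the pipeline it describes, not the authors' implementation: spaCy stands in for the unspecified pre-trained NER/parsing model, networkx supplies harmonic centrality, and sentence-level entity co-occurrence edges are a crude proxy for the paper's dependency-based graph construction.

```python
# Sketch of an entity-centric extractive summarizer in the spirit of the
# abstract: extract named entities, build a small entity graph, rank
# entities by harmonic centrality, and pick the sentences that mention
# the most central entities. Library choices are assumptions.
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")  # any pre-trained pipeline with NER + parser

def summarize(text: str, max_sentences: int = 3) -> str:
    doc = nlp(text)
    graph = nx.Graph()
    # Link entities that co-occur in a sentence; a stand-in for the
    # dependency-path edges the paper presumably derives from the parser.
    for sent in doc.sents:
        ents = [ent.text for ent in sent.ents]
        for i, a in enumerate(ents):
            for b in ents[i + 1:]:
                if a != b:
                    graph.add_edge(a, b)
    if graph.number_of_nodes() == 0:
        return ""
    # Rank entities by harmonic centrality, then score each sentence by
    # the total centrality of the entities it mentions.
    centrality = nx.harmonic_centrality(graph)
    def score(sent):
        return sum(centrality.get(ent.text, 0.0) for ent in sent.ents)
    ranked = sorted(doc.sents, key=score, reverse=True)[:max_sentences]
    ranked.sort(key=lambda s: s.start)  # restore document order
    return " ".join(s.text for s in ranked)
```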

Updated: 2020-09-11