Summaformers @ LaySumm 20, LongSumm 20

arXiv - CS - Information Retrieval. Pub Date: 2021-01-10, DOI: arxiv-2101.03553
Sayar Ghosh Roy, Nikhil Pinnaparaju, Risubh Jain, Manish Gupta, Vasudeva Varma

Automatic text summarization has been widely studied as an important task in natural language processing. Traditionally, various systems based on feature engineering and machine learning have been proposed for extractive as well as abstractive text summarization. Recently, deep learning-based systems, specifically Transformer-based ones, have become immensely popular. Summarization is a cognitively challenging task: extracting summary-worthy sentences is laborious, and expressing semantics concisely in abstractive summarization is complicated. In this paper, we specifically look at the problem of summarizing scientific research papers from multiple domains. We differentiate between two types of summaries, namely, (a) LaySumm: a very short summary that captures the essence of the research paper in layman's terms, avoiding overly specific technical jargon, and (b) LongSumm: a much longer, detailed summary aimed at providing specific insights into the various ideas touched upon in the paper. While leveraging the latest Transformer-based models, our systems are simple, intuitive, and based on how specific paper sections contribute to human summaries of the two types described above. Evaluations against gold standard summaries using ROUGE metrics demonstrate the effectiveness of our approach. On blind test corpora, our system ranks first and third for the LongSumm and LaySumm tasks, respectively.
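For readers unfamiliar with the ROUGE metrics mentioned in the abstract, the following is a minimal sketch of how ROUGE-N scores a generated summary against a gold standard summary by counting overlapping n-grams. It is an illustration only, not the shared tasks' official scorer or the authors' evaluation code: the function names `ngrams` and `rouge_n`, the lowercased whitespace tokenization, and the two example sentences are all assumptions made for this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision, recall and F1 for two strings.

    Assumption of this sketch: tokens are produced by lowercasing and
    splitting on whitespace. Official scorers tokenize differently and
    may apply stemming, so their numbers can differ.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram,
    # which gives the clipped number of matches.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not taken from the shared-task data.
gold = "the system ranks first on the long summary task"
generated = "the system ranks first on long summaries"
scores = rouge_n(generated, gold, n=1)
print(scores)
```

Recall measures how much of the gold summary the generated summary covers, while precision penalizes extra material, so a long summary and a short lay summary can trade these off quite differently.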


Updated: 2021-01-12
