Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning
Journal of Information Science (IF 2.4), Pub Date: 2021-02-15, DOI: 10.1177/0165551521990616
Salima Lamsiyah, Abdelkader El Mahdaouy, Saïd El Alaoui Ouatik, Bernard Espinasse

Text representation is a fundamental cornerstone that impacts the effectiveness of several text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider the order of and the semantic relationships between words in a sentence, and thus they do not carry the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from the BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark datasets using single-task and multi-task fine-tuning methods. Experiments are performed on the standard DUC'2002–2004 datasets. The obtained results show that our method significantly outperforms several baseline methods and achieves comparable, and sometimes better, performance than recent state-of-the-art deep learning–based methods. Furthermore, the results show that fine-tuning BERT using multi-task learning considerably improves performance.
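The overall pipeline the abstract describes (encode each sentence with a BERT-style sentence embedding model, then select the most representative sentences without supervision) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it substitutes an off-the-shelf sentence-embedding model ("all-MiniLM-L6-v2" from the sentence-transformers library) for the GLUE-fine-tuned BERT used in the paper, and uses a simple centroid-similarity score with top-k selection, details the abstract does not specify.

```python
# Minimal sketch: centroid-based unsupervised extractive summarization with
# sentence embeddings. The embedding model and the scoring/selection scheme
# are assumptions standing in for the paper's fine-tuned BERT and its
# sentence-selection strategy.
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

def summarize(sentences, model_name="all-MiniLM-L6-v2", num_sentences=3):
    model = SentenceTransformer(model_name)
    embeddings = model.encode(sentences)          # (n, d) sentence vectors
    centroid = embeddings.mean(axis=0)            # centroid of the document cluster
    # Relevance score = cosine similarity between each sentence and the centroid.
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(centroid)
    scores = embeddings @ centroid / np.clip(norms, 1e-9, None)
    top = np.argsort(-scores)[:num_sentences]
    return [sentences[i] for i in sorted(top)]    # preserve original sentence order

sentences = [
    "BERT produces contextual representations of text.",
    "Sentence embeddings capture the meaning of full sentences.",
    "Multi-task fine-tuning on GLUE can improve these representations.",
    "Extractive summarization selects the most relevant sentences.",
]
print(summarize(sentences, num_sentences=2))
```

In the paper's setting, the hypothetical `SentenceTransformer` stand-in would be replaced by the BERT encoder after single- or multi-task fine-tuning on GLUE intermediate tasks, which is what the reported gains are attributed to.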




Updated: 2021-02-16