TransBERT
ACM Transactions on Asian and Low-Resource Language Information Processing (IF 1.8) | Pub Date: 2021-03-09 | DOI: 10.1145/3427669
Zhongyang Li, Xiao Ding, Ting Liu

Recent advances, such as GPT, BERT, and RoBERTa, have shown success in incorporating a pre-trained transformer language model and fine-tuning operations to improve downstream NLP systems. However, this framework still has some fundamental problems in effectively incorporating supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework, which can transfer not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from various semantically related supervised tasks, for a target task. Particularly, we propose utilizing three kinds of transfer tasks, including natural language inference, sentiment classification, and next action prediction, to further train BERT based on a pre-trained model. This enables the model to get a better initialization for the target task. We take story-ending prediction as the target task to conduct experiments. The final results of 96.0% and 95.0% accuracy on two versions of Story Cloze Test datasets dramatically outperform previous state-of-the-art baseline methods. Several comparative experiments give some helpful suggestions on how to select transfer tasks to improve BERT. Furthermore, experiments on six English and three Chinese datasets show that TransBERT generalizes well to other tasks, languages, and pre-trained models.
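
The framework boils down to a two-stage recipe: start from a generally pre-trained BERT, further fine-tune it on a semantically related supervised transfer task (e.g. natural language inference), then fine-tune the resulting encoder on the target task. The sketch below illustrates that recipe with the Hugging Face transformers library; the checkpoint names, label counts, and the binary framing of story-ending prediction are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch of the two-stage transfer recipe described in the abstract,
# written against the Hugging Face "transformers" library (not the authors'
# code). Checkpoint names and label counts are illustrative assumptions.
from transformers import BertForSequenceClassification, BertTokenizerFast

BASE = "bert-base-uncased"
tokenizer = BertTokenizerFast.from_pretrained(BASE)

# Stage 1: further train the pre-trained BERT on a semantically related
# supervised transfer task, e.g. natural language inference with three labels
# (entailment / neutral / contradiction).
transfer_model = BertForSequenceClassification.from_pretrained(BASE, num_labels=3)
# ... fine-tune transfer_model on the NLI data with a standard classification loss ...
transfer_model.save_pretrained("bert-after-nli")
tokenizer.save_pretrained("bert-after-nli")

# Stage 2: initialize the target-task model from the transferred encoder and
# fine-tune on story-ending prediction, cast here as scoring a (story, ending)
# pair as a wrong or right ending. Only the classifier head is re-initialized;
# the transferred encoder weights are kept as the better starting point.
target_model = BertForSequenceClassification.from_pretrained(
    "bert-after-nli", num_labels=2, ignore_mismatched_sizes=True
)

# Toy usage: score one candidate ending against a story context.
inputs = tokenizer(
    "Kate packed her bag and drove to the beach with her friends.",  # story context
    "She had a great day in the sun.",                               # candidate ending
    return_tensors="pt",
)
logits = target_model(**inputs).logits  # shape (1, 2): scores for wrong / right ending
```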

Updated: 2021-03-09