Transformers: "The End of History" for NLP?
arXiv - CS - Information Retrieval. Pub Date: 2021-04-09, DOI: arxiv-2105.00813
Anton Chernyavskiy, Dmitry Ilvovsky, Preslav Nakov

Recent advances in neural architectures, such as the Transformer, coupled with the emergence of large-scale pre-trained models such as BERT, have revolutionized the field of Natural Language Processing (NLP), pushing the state-of-the-art for a number of NLP tasks. A rich family of variations of these models has been proposed, such as RoBERTa, ALBERT, and XLNet, but fundamentally, they all remain limited in their ability to model certain kinds of information, and they cannot cope with certain information sources, which was easy for pre-existing models. Thus, here we aim to shed some light on some important theoretical limitations of pre-trained BERT-style models that are inherent in the general Transformer architecture. First, we demonstrate in practice on two general types of tasks -- segmentation and segment labeling -- and four datasets that these limitations are indeed harmful and that addressing them, even in some very simple and naive ways, can yield sizable improvements over vanilla RoBERTa and XLNet. Then, we offer a more general discussion on desiderata for future additions to the Transformer architecture that would increase its expressiveness, which we hope could help in the design of the next generation of deep NLP architectures.
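To make the two evaluated task types concrete, below is a minimal sketch (not the authors' code) of how segment labeling can be framed as token classification with a pretrained RoBERTa encoder, assuming the Hugging Face `transformers` library; the BIO-style label set and example sentence are hypothetical.

```python
# Sketch: segment labeling as token classification with vanilla RoBERTa,
# assuming the Hugging Face `transformers` library is installed.
import torch
from transformers import RobertaTokenizerFast, RobertaForTokenClassification

# Hypothetical BIO-style label set marking segment boundaries.
labels = ["B-SEG", "I-SEG", "O"]

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForTokenClassification.from_pretrained(
    "roberta-base", num_labels=len(labels)
)

# A toy pre-tokenized input consisting of two segments.
words = ["This", "is", "one", "segment", ".", "Here", "is", "another", "."]
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits          # shape: (1, seq_len, num_labels)
pred_ids = logits.argmax(dim=-1)[0].tolist()

# Map subword predictions back to whole words (first subword wins).
word_ids = enc.word_ids()
for idx, wid in enumerate(word_ids):
    if wid is not None and (idx == 0 or word_ids[idx - 1] != wid):
        print(words[wid], labels[pred_ids[idx]])
```

In this framing, segmentation reduces to predicting boundary labels per token; the paper's point is that the vanilla encoder-plus-classifier setup struggles with certain kinds of information in such tasks, and that even simple additions on top of RoBERTa or XLNet can yield sizable gains.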

Last updated: 2021-05-04