Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units
arXiv - CS - Computation and Language. Pub Date: 2021-06-09, DOI: arxiv-2106.04935
Sara Meftah, Nasredine Semmar, Youssef Tamaazousti, Hassane Essafi, Fatiha Sadat

Neural Transfer Learning (TL) is becoming ubiquitous in Natural Language Processing (NLP), thanks to its high performance on many tasks, especially in low-resource scenarios. Notably, TL is widely used for neural domain adaptation to transfer valuable knowledge from high-resource to low-resource domains. In the standard fine-tuning scheme of TL, a model is first pre-trained on a source domain and subsequently fine-tuned on a target domain; source and target domains are therefore trained with the same architecture. In this paper, we show through interpretation methods that such a scheme, despite its efficiency, suffers from a main limitation: although capable of adapting to new domains, pre-trained neurons struggle to learn certain patterns that are specific to the target domain. Moreover, we shed light on the hidden negative transfer that occurs despite the high relatedness between source and target domains, and that may reduce the final gain brought by transfer learning. To address these problems, we propose to augment the pre-trained model with normalised, weighted and randomly initialised units that foster better adaptation while preserving the valuable source knowledge. We show that our approach yields significant improvements over the standard fine-tuning scheme for neural domain adaptation from the news domain to the social media domain on four NLP tasks: part-of-speech tagging, chunking, named entity recognition and morphosyntactic tagging.
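To make the proposed augmentation concrete, the sketch below shows one way a pre-trained layer could be widened with randomly initialised, normalised and weighted units. This is a minimal illustration under our own assumptions, not the authors' released code: the class name AugmentedLayer, the use of LayerNorm for normalisation, the learnable gate alpha and the layer sizes are all illustrative choices.

```python
# A minimal sketch (not the authors' implementation) of augmenting a pre-trained
# layer with randomly initialised units whose contribution is normalised and
# weighted before being concatenated with the pre-trained units' output.
import torch
import torch.nn as nn


class AugmentedLayer(nn.Module):
    def __init__(self, pretrained_layer: nn.Linear, num_random_units: int):
        super().__init__()
        self.pretrained = pretrained_layer           # weights copied from the source model
        self.random = nn.Linear(pretrained_layer.in_features,
                                num_random_units)    # randomly initialised, target-specific units
        self.norm = nn.LayerNorm(num_random_units)   # normalise the random units' activations
        self.alpha = nn.Parameter(torch.zeros(num_random_units))  # learnable weighting, starts at zero

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h_pre = self.pretrained(x)                         # knowledge transferred from the source domain
        h_rand = self.alpha * self.norm(self.random(x))    # weighted, normalised random units
        return torch.cat([h_pre, h_rand], dim=-1)          # widened representation fed to the task head


# Usage: wrap a pre-trained 768-unit projection and add 128 random units.
source_proj = nn.Linear(768, 768)                    # stands in for a layer of the pre-trained model
layer = AugmentedLayer(source_proj, num_random_units=128)
tokens = torch.randn(2, 10, 768)                     # (batch, sequence, hidden)
print(layer(tokens).shape)                           # torch.Size([2, 10, 896])
```

Initialising alpha at zero means the augmented units contribute nothing at the start of fine-tuning and the model behaves exactly like the pre-trained one; the gate then learns how much weight to give the target-specific units, which matches the stated goal of adapting to the target domain while preserving the source knowledge.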

Updated: 2021-06-10