PADA: A Prompt-based Autoregressive Approach for Adaptation to Unseen Domains


arXiv - CS - Computation and Language. Pub Date: 2021-02-24, DOI: arxiv-2102.12206
Eyal Ben-David, Nadav Oved, Roi Reichart

Natural Language Processing algorithms have made incredible progress recently, but they still struggle when applied to out-of-distribution examples. In this paper, we address a very challenging and previously underexplored version of this domain adaptation problem. In our setup an algorithm is trained on several source domains, and then applied to examples from an unseen domain that is unknown at training time. Particularly, no examples, labeled or unlabeled, or any other knowledge about the target domain are available to the algorithm at training time. We present PADA: A Prompt-based Autoregressive Domain Adaptation algorithm, based on the T5 model. Given a test example, PADA first generates a unique prompt and then, conditioned on this prompt, labels the example with respect to the NLP task. The prompt is a sequence of unrestricted length, consisting of pre-defined Domain Related Features (DRFs) that characterize each of the source domains. Intuitively, the prompt is a unique signature that maps the test example to the semantic space spanned by the source domains. In experiments with two tasks: Rumour Detection and Multi-Genre Natural Language Inference (MNLI), for a total of 10 multi-source adaptation scenarios, PADA strongly outperforms state-of-the-art approaches and additional strong baselines.
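The two-step inference flow described in the abstract (first produce a prompt of DRFs, then label the example conditioned on that prompt) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the domain names, the DRF lists, the helper names `generate_prompt` and `build_model_input`, and the prompt/text separator are hypothetical, and simple keyword matching stands in for the autoregressive T5 generation that PADA actually uses.

```python
# Toy sketch of prompt-then-label inference. NOT the authors' implementation:
# PADA generates the prompt autoregressively with T5, whereas this stand-in
# uses plain keyword lookup so the control flow can be read in isolation.

# Hypothetical Domain Related Features (DRFs) for three made-up source domains.
SOURCE_DRFS = {
    "domain_a": ["paris", "attack", "hostages"],
    "domain_b": ["police", "protest", "shooting"],
    "domain_c": ["cafe", "hostages", "gunman"],
}


def generate_prompt(text, source_drfs=SOURCE_DRFS):
    """Step 1 (stand-in): collect the source-domain DRFs that occur in the text.

    The result plays the role of the example-specific prompt that relates a
    test example from an unseen domain to the known source domains.
    """
    tokens = set(text.lower().split())
    prompt = []
    for domain in sorted(source_drfs):
        for drf in source_drfs[domain]:
            if drf in tokens and drf not in prompt:
                prompt.append(drf)
    return prompt


def build_model_input(prompt, text):
    """Step 2 input: prepend the prompt so the label prediction is conditioned on it.

    The ", " and ": " separators are assumptions, not taken from the paper.
    """
    return ", ".join(prompt) + ": " + text


example = "Gunman holds hostages in cafe"
prompt = generate_prompt(example)
model_input = build_model_input(prompt, example)
```

In a full system, `model_input` would be passed to the task model to predict the label; that step is omitted here because it depends on a trained model.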


Updated: 2021-02-25
