Efficient Attribute Injection for Pretrained Language Models
arXiv - CS - Computation and Language. Pub Date: 2021-09-16. arXiv ID: 2109.07953
Reinald Kim Amplayo, Kang Min Yoo, Sang-Woo Lee

Metadata attributes (e.g., user and product IDs from reviews) can be incorporated as additional inputs to neural NLP models by modifying the models' architecture, in order to improve their performance. Recent models, however, rely on pretrained language models (PLMs), for which previously used attribute injection techniques are either nontrivial or ineffective. In this paper, we propose a lightweight and memory-efficient method to inject attributes into PLMs. We extend adapters, i.e., tiny plug-in feed-forward modules, to include attributes both independently of and jointly with the text. To limit the increase in parameters, especially when the attribute vocabulary is large, we use low-rank approximations and hypercomplex multiplications, significantly decreasing the total parameter count. We also introduce training mechanisms to handle domains in which attributes can be multi-labeled or sparse. Extensive experiments and analyses on eight datasets from different domains show that our method outperforms previous attribute injection methods and achieves state-of-the-art performance on various datasets.
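To make the idea concrete, the sketch below shows one plausible way an adapter could be conditioned on an attribute (e.g., a user ID) while keeping the attribute embedding table small via a low-rank factorization. All names and dimensions (AttributeAdapter, attr_rank, bottleneck, etc.) are illustrative assumptions, not the paper's released implementation; the hypercomplex (parameterized hypercomplex multiplication) layers and the training mechanisms for multi-labeled or sparse attributes are omitted.

```python
import torch
import torch.nn as nn

class AttributeAdapter(nn.Module):
    """Minimal sketch of an attribute-conditioned adapter.

    Assumptions: the attribute embedding is factorized as |A| x r followed by
    r x bottleneck (instead of a full |A| x hidden table), and the attribute
    vector shifts the adapter's bottleneck activations before the residual add.
    """

    def __init__(self, hidden_size=768, bottleneck=64,
                 num_attributes=10_000, attr_rank=8):
        super().__init__()
        # Standard adapter bottleneck: down-project -> nonlinearity -> up-project.
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()
        # Low-rank attribute embedding: |A| x r plus r x bottleneck parameters.
        self.attr_low = nn.Embedding(num_attributes, attr_rank)
        self.attr_proj = nn.Linear(attr_rank, bottleneck, bias=False)

    def forward(self, hidden_states, attribute_ids):
        # hidden_states: (batch, seq_len, hidden_size)
        # attribute_ids: (batch,), e.g. one user or product ID per example
        z = self.act(self.down(hidden_states))
        attr = self.attr_proj(self.attr_low(attribute_ids))  # (batch, bottleneck)
        z = z + attr.unsqueeze(1)                            # broadcast over the sequence
        return hidden_states + self.up(z)                    # residual connection


if __name__ == "__main__":
    adapter = AttributeAdapter()
    h = torch.randn(2, 16, 768)          # dummy PLM hidden states
    users = torch.tensor([3, 42])        # dummy attribute IDs
    print(adapter(h, users).shape)       # torch.Size([2, 16, 768])
```

With this factorization, the attribute-specific parameters grow as |A| * r + r * bottleneck rather than |A| * hidden_size, which is the kind of saving the low-rank approximation in the abstract refers to.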

Updated: 2021-09-17