Task-adaptive Pre-training of Language Models with Word Embedding Regularization
arXiv - CS - Computation and Language. Pub Date: 2021-09-17. DOI: arxiv-2109.08354. Authors: Kosuke Nishida, Kyosuke Nishida, Sen Yoshida.
Pre-trained language models (PTLMs) acquire domain-independent linguistic
knowledge through pre-training with massive textual resources. Additional
pre-training is effective in adapting PTLMs to domains that are not well
covered by the pre-training corpora. Here, we focus on the static word
embeddings of PTLMs for domain adaptation to teach PTLMs domain-specific
meanings of words. We propose a novel fine-tuning process: task-adaptive
pre-training with word embedding regularization (TAPTER). TAPTER runs
additional pre-training by making the static word embeddings of a PTLM close to
the word embeddings obtained in the target domain with fastText. TAPTER
requires no additional corpus except for the training data of the downstream
task. We confirmed that TAPTER improves the performance of both standard
fine-tuning and task-adaptive pre-training on BioASQ (question answering in
the biomedical domain) and on SQuAD (the Wikipedia domain) when the
pre-training corpora are not dominated by in-domain data.
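The abstract does not give the exact form of the regularizer, but the described idea, pulling the PTLM's static (input) word embeddings toward fastText embeddings trained on the downstream task's training data, can be sketched as an auxiliary penalty added to the usual masked-language-modeling loss. The following is a minimal illustration, assuming a squared-L2 penalty averaged over the vocabulary; the function name `embedding_regularization` and the weight `lam` are hypothetical, not from the paper.

```python
import numpy as np

def embedding_regularization(ptlm_emb, domain_emb, lam=0.1):
    """Squared-L2 penalty between the PTLM's static word embeddings and
    fixed in-domain (e.g. fastText) embeddings, averaged over the vocabulary.

    Returns the penalty and its gradient w.r.t. the PTLM embeddings, which
    would be added to the MLM loss during task-adaptive pre-training.
    """
    diff = ptlm_emb - domain_emb
    loss = lam * np.mean(np.sum(diff ** 2, axis=-1))
    grad = 2.0 * lam * diff / len(ptlm_emb)
    return loss, grad

rng = np.random.default_rng(0)
vocab, dim = 100, 16
ptlm = rng.normal(size=(vocab, dim))      # trainable PTLM embedding matrix
fasttext = rng.normal(size=(vocab, dim))  # frozen in-domain embeddings

loss, grad = embedding_regularization(ptlm, fasttext)
# A gradient step on this term alone moves each row of `ptlm` toward
# its fastText counterpart, teaching domain-specific word meanings.
ptlm_updated = ptlm - 1.0 * grad
```

In an actual fine-tuning loop the total loss would be the MLM loss plus this penalty, so the embeddings are regularized toward the domain vectors while the rest of the model adapts as usual.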
Updated: 2021-09-20