Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU
arXiv - CS - Computation and Language Pub Date : 2021-09-17 , DOI: arxiv-2109.08259 Meghana Moorthy Bhat, Alessandro Sordoni, Subhabrata Mukherjee
While pre-trained language models have obtained state-of-the-art performance
on several natural language understanding tasks, they are quite opaque in
terms of their decision-making process. Some recent works rationalize neural
predictions by highlighting salient concepts in the text as justifications or
rationales, but they rely on thousands of labeled training examples, with both
task labels and annotated rationales for every instance. Such extensive
large-scale annotations are infeasible to obtain for many tasks. To this end,
we develop a multi-task teacher-student framework based on self-training
language models with limited task-specific labels and rationales, and
judicious sample selection to learn from informative pseudo-labeled examples.
We study several characteristics of what constitutes a good rationale and
demonstrate that neural model performance can be significantly improved by
making the model aware of its rationalized predictions, particularly in
low-resource settings. Extensive experiments on several benchmark datasets
demonstrate the effectiveness of our approach.
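The self-training loop the abstract describes — a teacher producing pseudo-labels on unlabeled data, with judicious sample selection keeping only informative examples for the student — can be sketched in miniature. This is an illustrative toy (a 1-D threshold classifier stands in for the language model, and confidence is a simple distance proxy), not the paper's implementation; all names are hypothetical.

```python
# Toy sketch of self-training with confidence-based sample selection.
# A 1-D threshold classifier stands in for the teacher/student models;
# names and the confidence heuristic are illustrative, not from the paper.

def train_threshold(examples):
    """Fit a threshold classifier: predict 1 if x >= threshold.

    The threshold is the midpoint between the two class means.
    """
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict_with_confidence(threshold, x):
    """Return (label, confidence); distance to the boundary is a crude proxy."""
    label = 1 if x >= threshold else 0
    confidence = min(1.0, abs(x - threshold))
    return label, confidence

def self_train(labeled, unlabeled, rounds=3, min_conf=0.5):
    """Iteratively pseudo-label unlabeled data, keeping only confident samples."""
    data = list(labeled)
    for _ in range(rounds):
        teacher = train_threshold(data)
        # Judicious sample selection: only confident pseudo-labels
        # are passed on to the next (student) training round.
        selected = []
        for x in unlabeled:
            y, conf = predict_with_confidence(teacher, x)
            if conf >= min_conf:
                selected.append((x, y))
        data = list(labeled) + selected
    return train_threshold(data)
```

In the paper's setting the teacher additionally emits rationales alongside pseudo-labels, and selection weighs how informative each pseudo-labeled example is; the loop structure above is only the self-training skeleton.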
Updated: 2021-09-20
虽然预训练的语言模型在多项自然语言理解任务中获得了最先进的性能,但它们在决策过程方面却相当不透明。虽然最近的一些工作侧重于通过突出文本中的显着概念作为理由或理由来合理化神经预测,但它们依赖于任务标签的数千个标记训练示例以及每个实例的注释基本原理。对于许多任务来说,获得如此广泛的大规模注释是不可行的。为此,我们开发了一个基于自训练语言模型的多任务师生框架,具有有限的任务特定标签和基本原理,以及明智的样本选择,以从信息伪标记的例子中学习。我们研究了构成良好基本原理的几个特征,并证明可以通过使其意识到其合理化预测来显着提高神经模型的性能,特别是在低资源环境中。在几个基准数据集中的大量实验证明了我们方法的有效性。