Usr-mtl: an unsupervised sentence representation learning framework with multi-task learning
Applied Intelligence ( IF 5.3 ) Pub Date : 2020-11-14 , DOI: 10.1007/s10489-020-02042-2
Wenshen Xu , Shuangyin Li , Yonghe Lu

Learning effective text representations, and in particular extracting sentence-level features, is increasingly important for building intelligent systems. Numerous previous studies have addressed sentence representation learning with deep learning approaches. However, existing methods are mostly built around a single task, or rely on labeled corpora when learning sentence embeddings. In this paper, we examine the factors that matter for learning sentence representations and propose an efficient unsupervised learning framework with multi-task learning (USR-MTL), in which several text learning tasks are merged into a unified framework. Drawing on the syntactic and semantic features of sentences, three factors are reflected in sentence representation learning: the wording of a sentence, the ordering of its words, and the ordering of its neighboring sentences. Accordingly, we integrate a word-order learning task, a word prediction task, and a sentence-order learning task into the proposed framework to obtain meaningful sentence embeddings. The process of sentence embedding learning is thus reformulated as a multi-task learning problem combining one sentence-level task with two word-level tasks. Moreover, the framework is trained by an unsupervised learning algorithm on an unlabeled corpus. Experimental results show that our approach achieves state-of-the-art performance on downstream natural language processing tasks compared to popular unsupervised representation learning techniques. Experiments on representation visualization and task analysis further demonstrate that the tasks in the proposed framework produce reasonable sentence representations, confirming the capacity of the proposed unsupervised multi-task framework for sentence representation learning.
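The abstract describes a shared sentence encoder trained jointly on three self-supervised objectives: word prediction, word-order classification, and sentence-order classification. The following is a minimal NumPy sketch of that multi-task setup; the encoder, the task heads, and the unweighted sum of losses are all illustrative assumptions, not the paper's actual architecture or loss weighting.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HID = 100, 16, 32

# Shared encoder parameters (toy stand-in; the paper learns these jointly).
W_emb = rng.normal(scale=0.1, size=(VOCAB, EMB))
W_enc = rng.normal(scale=0.1, size=(EMB, HID))

def encode(ids):
    """Toy shared sentence encoder: mean-pooled embeddings + projection."""
    return np.tanh(W_emb[ids].mean(axis=0) @ W_enc)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def xent(p, y):
    """Cross-entropy for a single example."""
    return -np.log(p[y] + 1e-12)

# One hypothetical linear head per task.
W_wp = rng.normal(scale=0.1, size=(HID, VOCAB))  # word prediction
W_wo = rng.normal(scale=0.1, size=(2 * HID, 2))  # word order: original vs shuffled
W_so = rng.normal(scale=0.1, size=(2 * HID, 2))  # sentence order: adjacent vs not

def usr_mtl_loss(sent, shuffled_sent, next_sent, held_out_word):
    """Combine the three task losses for one training example."""
    h, h_shuf, h_next = encode(sent), encode(shuffled_sent), encode(next_sent)
    # Word-level task 1: predict a held-out word of the sentence.
    l_wp = xent(softmax(h @ W_wp), held_out_word)
    # Word-level task 2: classify which variant has the original word order.
    l_wo = xent(softmax(np.concatenate([h, h_shuf]) @ W_wo), 0)
    # Sentence-level task: classify whether the second sentence follows the first.
    l_so = xent(softmax(np.concatenate([h, h_next]) @ W_so), 0)
    return l_wp + l_wo + l_so  # unweighted sum; real task weights may differ

sent = rng.integers(0, VOCAB, size=8)
loss = usr_mtl_loss(sent, rng.permutation(sent),
                    rng.integers(0, VOCAB, size=8), int(sent[0]))
```

In a full implementation all parameters would be updated by backpropagating this combined loss, so the encoder is shaped by all three objectives at once while remaining fully unsupervised: every label above is derived from the raw text itself.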




Updated: 2020-11-15