Deep transfer learning mechanism for fine-grained cross-domain sentiment classification
Connection Science (IF 5.3), Pub Date: 2021-05-17, DOI: 10.1080/09540091.2021.1912711
Zixuan Cao, Yongmei Zhou, Aimin Yang, Sancheng Peng

The goal of cross-domain sentiment classification is to utilise useful information from the source domain to help classify sentiment polarity in the target domain, which contains a large amount of unlabelled data. Most existing methods focus on extracting domain-invariant features between the two domains, but they fail to make full use of the unlabelled data in the target domain. To solve this problem, we present a deep transfer learning mechanism (DTLM) for fine-grained cross-domain sentiment classification. DTLM provides a transfer mechanism to better transfer sentiment across domains by incorporating BERT (Bidirectional Encoder Representations from Transformers) and KL (Kullback-Leibler) divergence. We introduce BERT as a feature encoder to map the text data of different domains into a shared feature space. Then, we design a domain-adaptive model using KL divergence to eliminate the difference in feature distribution between the source domain and the target domain. In addition, we introduce entropy minimisation and consistency regularisation to process unlabelled samples in the target domain. Extensive experiments on datasets from YelpAspect, SemEval 2014 Task 4 and Twitter not only demonstrate the effectiveness of the proposed method but also provide a better way for cross-domain sentiment classification.
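
To make the components named in the abstract concrete, below is a minimal PyTorch sketch of the three ingredients: a shared BERT feature encoder, a KL-divergence term aligning source and target feature distributions, and entropy-minimisation plus consistency losses for unlabelled target samples. This is not the authors' DTLM implementation; the function names (encode, kl_domain_loss, entropy_min_loss, consistency_loss) and all loss weights are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

# Shared feature encoder: maps text from either domain into one feature space.
bert = BertModel.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def encode(texts):
    """Encode a list of raw sentences into BERT pooled features."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return bert(**batch).pooler_output  # shape: (batch_size, hidden_size)

def kl_domain_loss(src_feat, tgt_feat):
    """KL divergence between the softmax-normalised mean features of the
    source and target batches, used here as a simple domain-alignment term."""
    p_log = F.log_softmax(src_feat.mean(dim=0), dim=-1)
    q = F.softmax(tgt_feat.mean(dim=0), dim=-1)
    return F.kl_div(p_log, q, reduction="sum")

def entropy_min_loss(tgt_logits):
    """Entropy minimisation: encourage confident predictions on unlabelled
    target-domain samples."""
    prob = F.softmax(tgt_logits, dim=-1)
    return -(prob * torch.log(prob + 1e-8)).sum(dim=-1).mean()

def consistency_loss(logits_a, logits_b):
    """Consistency regularisation: two stochastic forward passes (e.g. with
    dropout) over the same target sample should give similar predictions."""
    return F.mse_loss(F.softmax(logits_a, dim=-1), F.softmax(logits_b, dim=-1))

# Illustrative combined objective; the weights lam_* are placeholders.
def total_loss(cls_loss, src_feat, tgt_feat, tgt_logits_a, tgt_logits_b,
               lam_kl=0.1, lam_ent=0.1, lam_con=0.1):
    return (cls_loss
            + lam_kl * kl_domain_loss(src_feat, tgt_feat)
            + lam_ent * entropy_min_loss(tgt_logits_a)
            + lam_con * consistency_loss(tgt_logits_a, tgt_logits_b))
```

In this sketch the supervised classification loss comes from labelled source-domain data, while the KL, entropy and consistency terms only touch features and predictions, so they can be applied to unlabelled target-domain batches as the abstract describes.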



Updated: 2021-05-17