CGSPN: cascading gated self-attention and phrase-attention network for sentence modeling
Journal of Intelligent Information Systems (IF 3.4), Pub Date: 2020-06-24, DOI: 10.1007/s10844-020-00610-z
Yanping Fu, Yun Liu

Sentence modeling is a critical issue in feature generation for many natural language processing (NLP) tasks. Recently, most work has generated sentence representations with models built on Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), and various attention mechanisms. However, these models have two limitations: (1) they represent sentences for a single task only, obtained by fine-tuning network parameters, and (2) they treat a sentence as a concatenation of words and ignore the role of phrases. In this paper, we propose a Cascading Gated Self-attention and Phrase-attention Network (CGSPN) that generates a sentence embedding by considering both the contextual words and the key phrases in a sentence. Specifically, we first present a word-interaction gating self-attention mechanism that identifies important words and builds the relationships between them. We then cascade a phrase-attention structure that abstracts the semantics of phrases to generate the sentence representation. Experiments on different NLP tasks show that the proposed CGSPN model achieves the highest accuracy among most sentence-encoding methods: it improves the previous best result on the Stanford Sentiment Treebank (SST) by 1.76% and attains the best test accuracy on several sentence classification datasets. On the Natural Language Inference (NLI) task, CGSPN without phrase-attention outperforms the full CGSPN model and obtains competitive performance against state-of-the-art baselines, which shows where the proposed model is and is not applicable. On other NLP tasks, we also compare our model with popular methods to further explore the direction of our work.
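The abstract describes the architecture only at a high level. As a concrete illustration, the following PyTorch sketch shows one plausible reading of the two cascaded stages: a gated self-attention layer that mixes each word with its attended context through a learned gate, followed by a phrase-attention layer that scores convolutionally extracted n-gram phrases and pools them into a sentence embedding. All module names, the gating formulation, the use of a convolution to form phrase representations, and the dimensions are assumptions made for illustration, not details taken from the paper.

# Hypothetical sketch of the CGSPN cascade (not the authors' published code).
# Stage 1: gated self-attention over word vectors.
# Stage 2: phrase-attention pooling into a sentence embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedSelfAttention(nn.Module):
    """Self-attention whose output is mixed with the input word via a gate."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.gate = nn.Linear(2 * dim, dim)  # gate from word + its context

    def forward(self, x):                    # x: (batch, seq_len, dim)
        scores = self.q(x) @ self.k(x).transpose(1, 2) / x.size(-1) ** 0.5
        ctx = F.softmax(scores, dim=-1) @ self.v(x)   # attended context
        g = torch.sigmoid(self.gate(torch.cat([x, ctx], dim=-1)))
        return g * ctx + (1 - g) * x         # gated mix of word and context

class PhraseAttention(nn.Module):
    """Scores convolutionally extracted phrases and pools them."""
    def __init__(self, dim, phrase_len=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, phrase_len, padding=phrase_len // 2)
        self.score = nn.Linear(dim, 1)

    def forward(self, h):                    # h: (batch, seq_len, dim)
        phrases = torch.tanh(self.conv(h.transpose(1, 2))).transpose(1, 2)
        alpha = F.softmax(self.score(phrases), dim=1)  # phrase weights
        return (alpha * phrases).sum(dim=1)  # sentence embedding: (batch, dim)

class CGSPN(nn.Module):
    def __init__(self, dim=300):
        super().__init__()
        self.word_layer = GatedSelfAttention(dim)
        self.phrase_layer = PhraseAttention(dim)

    def forward(self, x):
        return self.phrase_layer(self.word_layer(x))

sent = torch.randn(2, 20, 300)               # toy batch of word embeddings
print(CGSPN()(sent).shape)                   # torch.Size([2, 300])

In this reading, the gate decides per dimension how much attended context to inject into each word vector, and the phrase layer's attention weights play the role of key-phrase selection; the paper's actual formulation may differ.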
