SACNN: Self-attentive Convolutional Neural Network Model for Natural Language Inference
ACM Transactions on Asian and Low-Resource Language Information Processing (IF 1.8) | Pub Date: 2021-06-16 | DOI: 10.1145/3426884
Waris Quamer, Praphula Kumar Jain, Arpit Rai, Vijayalakshmi Saravanan, Rajendra Pamula, Chiranjeev Kumar

Inference has long been a central problem for understanding and reasoning in artificial intelligence. In particular, natural language inference is an interesting problem that has attracted the attention of many researchers. Natural language inference aims to predict whether a hypothesis sentence can be inferred from a premise sentence. Most prior works rely on a simplistic association between premise and hypothesis sentence pairs, which is not sufficient for learning the complex relationships between them. This strategy also fails to fully exploit local context information. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks are not effective at modeling long-term dependencies, and their architectures are far more complex than Convolutional Neural Networks (CNNs). To address the problem of long-term dependencies and to incorporate context into better sentence representations, this article presents a general Self-Attentive Convolutional Neural Network (SACNN) for natural language inference and sentence pair modeling tasks. The proposed model uses CNNs to integrate mutual interactions between sentences, so that each sentence is represented with reference to its counterpart. Moreover, the self-attention mechanism helps fully exploit context semantics and long-term dependencies within a sentence. Experimental results show that SACNN outperforms strong baselines, achieving an accuracy of 89.7% on the Stanford Natural Language Inference (SNLI) dataset.
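The abstract describes two building blocks: self-attention to capture sentence-level context and long-term dependencies, and convolution to model local interactions, with premise and hypothesis representations combined for three-way classification. The PyTorch sketch below illustrates one plausible reading of that design; the layer sizes, single-head attention, max-pooling, and the [p; h; |p-h|; p*h] matching features are illustrative assumptions, not the authors' published configuration.

```python
# Minimal sketch of a self-attentive CNN encoder for sentence pair modeling (NLI).
# All hyperparameters and the combination scheme are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttentiveCNNEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Single-head scaled dot-product self-attention over the sentence
        # (assumed; the paper may use a different attention formulation).
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=1, batch_first=True)
        # 1-D convolution over the attended sequence to model local n-gram context.
        self.conv = nn.Conv1d(embed_dim, hidden_dim, kernel_size, padding=kernel_size // 2)

    def forward(self, tokens):                           # tokens: (batch, seq_len)
        x = self.embed(tokens)                           # (batch, seq_len, embed_dim)
        attended, _ = self.attn(x, x, x)                 # self-attention for long-range context
        h = F.relu(self.conv(attended.transpose(1, 2)))  # (batch, hidden_dim, seq_len)
        return h.max(dim=2).values                       # max-pool over time -> (batch, hidden_dim)


class SACNNClassifier(nn.Module):
    def __init__(self, vocab_size, hidden_dim=300, num_classes=3):
        super().__init__()
        self.encoder = SelfAttentiveCNNEncoder(vocab_size, hidden_dim=hidden_dim)
        # Combine premise and hypothesis vectors with common matching features
        # [p; h; |p - h|; p * h] before the softmax classifier (an assumption).
        self.classifier = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, premise, hypothesis):
        p = self.encoder(premise)
        h = self.encoder(hypothesis)
        pair = torch.cat([p, h, torch.abs(p - h), p * h], dim=-1)
        return self.classifier(pair)                     # logits: entailment / contradiction / neutral


if __name__ == "__main__":
    model = SACNNClassifier(vocab_size=10000)
    premise = torch.randint(1, 10000, (2, 12))           # two toy sentence pairs
    hypothesis = torch.randint(1, 10000, (2, 9))
    print(model(premise, hypothesis).shape)              # torch.Size([2, 3])
```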

Updated: 2021-06-16