Attention in Natural Language Processing
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2) · Pub Date: 2020-09-10 · DOI: 10.1109/tnnls.2020.3019893
Andrea Galassi, Marco Lippi, Paolo Torroni

Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.
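As a concrete illustration of the unified view the abstract describes, the minimal sketch below instantiates one common configuration: a scaled dot-product compatibility function paired with a softmax distribution function, applied to key and value vectors. The function and variable names are illustrative assumptions for this sketch, not the paper's notation, and other compatibility or distribution choices from the taxonomy could be substituted.

```python
import numpy as np

def attention(query, keys, values):
    """One generic attention step (illustrative sketch, not the paper's model):
    score keys against a query (compatibility function), normalize the scores
    into a distribution (distribution function), and return the weighted sum
    of the values (context vector)."""
    # Compatibility function: scaled dot product between the query and each key.
    energies = keys @ query / np.sqrt(query.shape[-1])
    # Distribution function: softmax turns energy scores into attention weights.
    weights = np.exp(energies - energies.max())
    weights /= weights.sum()
    # Weighted average of the values yields the context vector.
    return weights @ values, weights

# Toy usage: four key/value pairs of dimension 8.
rng = np.random.default_rng(0)
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
q = rng.normal(size=(8,))
context, a = attention(q, K, V)
print(a)        # attention distribution over the four inputs
print(context)  # context vector of dimension 8
```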

Updated: 2020-09-10