A Discriminative Convolutional Neural Network with Context-aware Attention, ACM Transactions on Intelligent Systems and Technology



ACM Transactions on Intelligent Systems and Technology (IF 7.2), Pub Date: 2020-07-07, DOI: 10.1145/3397464
Yuxiang Zhou₁, Lejian Liao₁, Yang Gao₁, Heyan Huang₁, Xiaochi Wei₂


Feature representation and feature extraction are two crucial procedures in text mining. Convolutional Neural Networks (CNNs) have shown overwhelming success in text-mining tasks, since they can efficiently extract n-gram features from source data. However, the vanilla CNN has its own weaknesses in feature representation and feature extraction. A certain number of filters in a CNN are inevitably duplicates, which hinders discriminative representation of a given text. In addition, most existing CNN models extract features in a fixed way (i.e., max pooling) that either limits the CNN to a local optimum or ignores the relations among all features, and so cannot adaptively learn contextual n-gram features. In this article, we propose a discriminative CNN with context-aware attention to address these weaknesses. Specifically, our model encourages discrimination across different filters by maximizing their earth mover's distances, and estimates the salience of feature candidates by considering the relations between context features. We carefully validate our findings against baselines on five benchmark classification datasets and two summarization datasets. The experimental results verify the competitive performance of the proposed model.
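The two ideas named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not say how filters are turned into distributions or how attention scores are computed, so the choices below (softmax-normalized filter weights, a one-dimensional earth mover's distance over unit-spaced bins, and dot-product scoring against the mean context vector) are assumptions made purely for illustration, and all function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def emd_1d(p, q):
    """Earth mover's distance between two histograms on the same
    unit-spaced bins, computed as the summed absolute CDF difference."""
    assert len(p) == len(q)
    cdf_gap, dist = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        dist += abs(cdf_gap)
    return dist


def filter_diversity(filters):
    """Mean pairwise EMD between filters (needs at least two filters).
    Assumption: each flattened filter is softmax-normalized into a histogram.
    A training loss could subtract this term to push filters apart."""
    dists = [softmax(f) for f in filters]
    pairs = [(i, j) for i in range(len(dists)) for j in range(i + 1, len(dists))]
    return sum(emd_1d(dists[i], dists[j]) for i, j in pairs) / len(pairs)


def max_pool(features):
    """Fixed extraction: keep the largest activation of each filter."""
    return [max(col) for col in zip(*features)]


def attention_pool(features):
    """Context-dependent extraction: weight each n-gram position by its
    dot product with the mean feature vector (an assumed scoring rule)."""
    n = len(features)
    context = [sum(col) / n for col in zip(*features)]
    scores = [sum(a * b for a, b in zip(row, context)) for row in features]
    weights = softmax(scores)
    pooled = [
        sum(w * row[k] for w, row in zip(weights, features))
        for k in range(len(context))
    ]
    return pooled, weights


if __name__ == "__main__":
    # Rows are n-gram positions, columns are filters (toy values).
    feats = [[0.1, 0.9], [0.8, 0.2], [0.7, 0.3]]
    print("max pooling:      ", max_pool(feats))
    print("attention pooling:", attention_pool(feats)[0])
    print("filter diversity: ", filter_diversity([[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]))
```

Max pooling keeps one value per filter regardless of the other positions, whereas the attention variant lets every position contribute in proportion to how well it agrees with the overall context. The diversity term is zero when all filters are identical and grows as their weight distributions move apart.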


Updated: 2020-07-07
