A commonality-based enhancement for sentence modeling with supervision
Neurocomputing ( IF 6 ) Pub Date : 2022-07-22 , DOI: 10.1016/j.neucom.2022.07.063
Zhe Chen , Cheng Liu , Jiansi Ren

Sentence pair modeling is a fundamental yet challenging problem for feature mining in natural language processing (NLP) tasks. Most recent works generate feature and sentence representations based on the interactive attention mechanism. However, these models have two limitations: (1) they only consider global information through attention coefficient weighting, which underuses critical features; (2) they only conduct internal training by fine-tuning network parameters, so the attention results are poorly explained. In this paper, inspired by human reasoning, we propose a Commonality Aggregated approach (CA) that enhances a lightweight interaction model by considering phrase features and contextual words. Specifically, we first fuse positional encoding and employ supervised training to extract critical phrase information from the text as the commonality of a sentence pair. Then, we deploy transfer learning and use an interaction network to combine crucial phrase features, core word features, and positional encoding to enhance sentence pair modeling. Extensive experiments on multiple benchmark datasets demonstrate that the proposed commonality aggregated method is effective and more competitive than the original network. Further visual analysis validates the more explicit interpretability of attention, and extended experimental results indicate the excellent generalization of our approach.
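As a rough illustration of the two generic ingredients the abstract names, positional encoding fused into token features and interactive attention between the two sentences, the following NumPy sketch may help. It is not the authors' CA method: the sinusoidal encoding, the dot-product scoring, and the function names `sinusoidal_positions` and `interactive_attention` are all assumptions chosen for illustration, and random vectors stand in for real word embeddings.

```python
import numpy as np


def sinusoidal_positions(length, dim):
    """Sinusoidal positional encoding (an assumed choice; the abstract
    does not say which encoding the paper uses)."""
    pos = np.arange(length)[:, None]          # (length, 1)
    idx = np.arange(dim)[None, :]             # (1, dim)
    angles = pos / np.power(10000.0, (2 * (idx // 2)) / dim)
    enc = np.zeros((length, dim))
    enc[:, 0::2] = np.sin(angles[:, 0::2])    # even dimensions
    enc[:, 1::2] = np.cos(angles[:, 1::2])    # odd dimensions
    return enc


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def interactive_attention(a, b):
    """Soft-align every token of sentence a against sentence b and back.

    a: (len_a, dim), b: (len_b, dim). Returns the raw score matrix and
    the attention-weighted summaries for each side.
    """
    scores = a @ b.T                              # (len_a, len_b)
    a_aligned = softmax(scores, axis=1) @ b       # (len_a, dim)
    b_aligned = softmax(scores, axis=0).T @ a     # (len_b, dim)
    return scores, a_aligned, b_aligned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 8
    # Hypothetical embeddings for a 5-token and a 7-token sentence,
    # with positional encoding fused by simple addition.
    sent_a = rng.normal(size=(5, dim)) + sinusoidal_positions(5, dim)
    sent_b = rng.normal(size=(7, dim)) + sinusoidal_positions(7, dim)
    scores, a_aligned, b_aligned = interactive_attention(sent_a, sent_b)
    print(scores.shape, a_aligned.shape, b_aligned.shape)
```

Each row of `softmax(scores, axis=1)` is the set of attention coefficients for one token of the first sentence; this kind of global weighting is what the abstract describes as underusing critical features.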




Updated: 2022-07-27