Keyword-Attentive Deep Semantic Matching
arXiv - CS - Computation and Language. Pub Date: 2020-03-11, DOI: arxiv-2003.11516
Changyu Miao, Zhen Cao and Yik-Cheung Tam

Deep semantic matching is a crucial component of many natural language processing applications, such as question answering (QA), where an input query is compared with each candidate question in a QA corpus in terms of relevance. Measuring the similarity of a query-question pair in an open-domain scenario can be challenging because of the diversity of word tokens in the query-question pair. We propose a keyword-attentive approach to improve deep semantic matching. We first leverage domain tags from a large corpus to generate a domain-enhanced keyword dictionary. Building on BERT, we stack a keyword-attentive transformer layer to highlight the importance of keywords in the query-question pair. For model training, we propose a new negative sampling approach based on keyword coverage between the input pair. We evaluate our approach on a Chinese QA corpus using several metrics, including the precision of retrieval candidates and the accuracy of semantic matching. Experiments show that our approach outperforms existing strong baselines. Our approach is general and can be applied to other text matching tasks with little adaptation.
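
The abstract does not spell out how keyword coverage drives negative sampling, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that coverage is the fraction of query keywords that also occur in a candidate question, and that non-matching candidates with high coverage are preferred as hard negatives. The names extract_keywords, keyword_coverage, sample_negatives and the min_coverage threshold are all hypothetical, as is the toy English keyword dictionary.

```python
from typing import Iterable, List, Set


def extract_keywords(tokens: Iterable[str], keyword_dict: Set[str]) -> Set[str]:
    """Return the tokens found in the (hypothetical) keyword dictionary."""
    return {t for t in tokens if t in keyword_dict}


def keyword_coverage(
    query_tokens: List[str], question_tokens: List[str], keyword_dict: Set[str]
) -> float:
    """Fraction of query keywords that also appear in the candidate question.

    Assumption: this definition of coverage is illustrative; the paper may
    define it differently.
    """
    query_kw = extract_keywords(query_tokens, keyword_dict)
    if not query_kw:
        return 0.0
    question_kw = extract_keywords(question_tokens, keyword_dict)
    return len(query_kw & question_kw) / len(query_kw)


def sample_negatives(
    query_tokens: List[str],
    non_matching: List[List[str]],
    keyword_dict: Set[str],
    min_coverage: float = 0.5,
    k: int = 2,
) -> List[List[str]]:
    """Pick up to k non-matching questions with the highest keyword coverage.

    Candidates below min_coverage are skipped, so the selected negatives
    share keywords with the query while differing in meaning.
    """
    scored = [
        (keyword_coverage(query_tokens, cand, keyword_dict), cand)
        for cand in non_matching
    ]
    kept = [(score, cand) for score, cand in scored if score >= min_coverage]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # stable sort, ties keep input order
    return [cand for _, cand in kept[:k]]


# Toy data: every value here is made up for illustration.
keyword_dict = {"refund", "order", "password"}
query = ["how", "refund", "order"]
non_matching = [
    ["cancel", "order"],
    ["reset", "password"],
    ["refund", "order", "status"],
]
negatives = sample_negatives(query, non_matching, keyword_dict)
```

In this toy run the candidate about resetting a password shares no keywords with the query and is dropped, while the two order-related questions are kept as harder negatives.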

Updated: 2020-03-26