Lexical Sememe Prediction using Dictionary Definitions by Capturing Local Semantic Correspondence

arXiv - CS - Computation and Language. Pub Date: 2020-01-16, DOI: arxiv-2001.05954
Jiaju Du, Fanchao Qi, Maosong Sun, Zhiyuan Liu

Sememes, defined in linguistics as the minimum semantic units of human languages, have proven useful in many NLP tasks. Because manually constructing and updating sememe knowledge bases (KBs) is costly, automatic sememe prediction has been proposed to assist sememe annotation. In this paper, we explore using dictionary definitions to predict sememes for unannotated words. We find that the sememes of a word are usually semantically matched to different words in its dictionary definition, and we call this matching relationship local semantic correspondence. Accordingly, we propose the Sememe Correspondence Pooling (SCorP) model, which captures this kind of matching to predict sememes. We evaluate our model and baseline methods on HowNet, a well-known sememe KB, and find that our model achieves state-of-the-art performance. Moreover, further quantitative analysis shows that our model properly learns the local semantic correspondence between sememes and words in dictionary definitions, which explains its effectiveness. The source code for this paper is available at https://github.com/thunlp/scorp.
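To make the idea of local semantic correspondence concrete, here is a minimal toy sketch in Python. It is not the authors' SCorP implementation (see the repository linked above for that); it only illustrates one plausible reading of the abstract: each candidate sememe is scored against every word in the definition, and the scores are max-pooled so that a sememe is predicted when at least one definition word matches it well. The function name `predict_sememes`, the hand-written 3-dimensional vectors, the dot-product scoring, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_sememes(definition_vecs, sememe_vecs, threshold=0.5):
    """Hypothetical sketch: score each sememe by its best-matching definition word.

    definition_vecs: dict mapping each definition word to its vector.
    sememe_vecs: dict mapping each candidate sememe to its vector.
    Returns {sememe: (probability, best_matching_word)} for sememes whose
    pooled probability reaches the threshold.
    """
    predictions = {}
    for sememe, s_vec in sememe_vecs.items():
        # Matching score between this sememe and every word in the definition.
        scores = {w: dot(w_vec, s_vec) for w, w_vec in definition_vecs.items()}
        # Max-pooling over definition words: keep the single best match.
        best_word = max(scores, key=scores.get)
        prob = 1.0 / (1.0 + math.exp(-scores[best_word]))  # sigmoid
        if prob >= threshold:
            predictions[sememe] = (prob, best_word)
    return predictions


# Toy data (made up): content words from a definition such as
# "a person who flies an aircraft", plus three candidate sememes.
definition_vecs = {
    "person": [2.0, 0.0, 0.0],
    "flies": [0.0, 1.0, 0.5],
    "aircraft": [0.0, 2.0, 0.0],
}
sememe_vecs = {
    "human": [1.0, 0.0, 0.0],
    "aircraft": [0.0, 1.0, 0.0],
    "food": [-1.0, -1.0, -1.0],
}

predicted = predict_sememes(definition_vecs, sememe_vecs)
for sememe, (prob, word) in predicted.items():
    print(f"{sememe}: p={prob:.2f}, matched definition word: {word}")
```

In this toy setup, each predicted sememe is tied to a different definition word, which mirrors the matching relationship the abstract describes. A real model would learn the word and sememe representations from data rather than use fixed vectors.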


Updated: 2020-01-17
