Disambiguating Arabic Words According to Their Historical Appearance in the Document Based on Recurrent Neural Networks


ACM Transactions on Asian and Low-Resource Language Information Processing (IF 1.8), Pub Date: 2020-10-16, DOI: 10.1145/3410569
Rim Laatar, Chafik Aloulou, Lamia Hadrich Belguith


How can we determine the meaning of a word from the context in which it appears? This question, known as Word Sense Disambiguation (WSD), is one of the central problems of Natural Language Processing (NLP). Word vectors extracted from neural network models have been successfully applied to WSD. This article presents a novel method that disambiguates Arabic words according to both their context in a source text and the era in which they appeared. Over the past few decades, many researchers have grappled with Arabic Word Sense Disambiguation. Arabic can be divided into three major historical periods: old Arabic, middle-age Arabic, and contemporary Arabic. Of these, contemporary Arabic has received the most attention from researchers. The main aim of our work is to disambiguate Arabic words according to the historical period in which they appeared. To this end, we propose a method that uses contextualized word embeddings to better capture the syntactic and semantic information of a word by taking its contextual uses into account. The key step is to convert both the senses and the contextual uses of an ambiguous word into vectors, then determine which of the target word's possible senses is closest to the given context.
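The final step described in the abstract, choosing the sense whose vector lies closest to the context vector, can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used or how the vectors are produced, so cosine similarity, the sense labels, and the toy three-dimensional vectors below are all assumptions made for illustration; in the actual method the vectors would come from contextualized embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pick_sense(context_vec, sense_vecs):
    """Return the label of the sense vector most similar to the context vector."""
    return max(sense_vecs, key=lambda label: cosine(context_vec, sense_vecs[label]))


# Hypothetical sense vectors for one ambiguous word (toy values, not real embeddings).
sense_vecs = {
    "sense_old_arabic": [0.9, 0.1, 0.0],
    "sense_contemporary": [0.1, 0.8, 0.3],
}

# Hypothetical vector representing the word's context in the document.
context_vec = [0.2, 0.7, 0.4]

best = pick_sense(context_vec, sense_vecs)
print(best)
```

With these toy values the context vector points in nearly the same direction as the second sense vector, so that sense is selected. Any other distance or similarity measure could be substituted in `pick_sense` without changing the overall scheme.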


Updated: 2020-10-16
