A simple Galois Power-of-Two real time embedding scheme for performing Arabic morphology deep learning tasks
Egyptian Informatics Journal (IF 5.2), Pub Date: 2020-04-21, DOI: 10.1016/j.eij.2020.03.002
Mohammed A. ELAffendi, Ibrahim Abuhaimed, Khawla AlRajhi

This paper describes how a simple, novel Galois Power-of-Two (GPOW2) real-time embedding scheme is used to improve the performance and accuracy of downstream NLP tasks. GPOW2 computes embeddings on the fly (in real time) in the context of the target NLP task, without the need for tabulated pre-embeddings. One notable feature of the method is its ability to capture multilevel embeddings in a single pass: it computes character, word and sentence embeddings simultaneously. GPOW2 was derived in the course of attempts to improve the performance of the SWAM Arabic morphological engine, a multipurpose tool that supports segmentation, classification, POS tagging, spell checking, word embeddings and semantic search, among other tasks. SWAM is a pattern-oriented algorithm that relies on morphological patterns and POS tagging to perform NLP tasks. The paper demonstrates how GPOW2 led to improvements in the accuracy of POS tagging and pattern matching, and accordingly in the performance of the whole engine. The reported accuracy is 99.47% for pattern prediction and 98.80% for POS tagging.




Updated: 2020-04-21