Parts-of-Speech tagging for Malayalam using deep learning techniques, International Journal of Information Technology



Parts-of-Speech tagging for Malayalam using deep learning techniques
International Journal of Information Technology, Pub Date: 2020-06-16, DOI: 10.1007/s41870-020-00491-z
K. K. Akhil, R. Rajimol, V. S. Anoop

Parts-of-speech tagging is the process of labeling each word in a sentence with its corresponding part of speech. It is a common pre-processing step for many natural language processing tasks. Early approaches relied on simple heuristics; later methods reported in the literature incorporated machine learning techniques such as artificial neural networks. More recently, advances in deep learning have made parts-of-speech tagging more accurate, and a reasonable number of taggers are now available for high-resource languages such as English. Low-resource languages such as Malayalam, however, still lack computationally efficient and accurate methods for parts-of-speech tagging. To address this gap, this work proposes a deep learning-based approach to parts-of-speech tagging for Malayalam. Experiments conducted on real datasets show that the proposed method outperforms some of the existing methods in terms of precision and accuracy.
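The abstract names precision and accuracy as the evaluation measures but does not say how they were computed. As a rough illustration only, the sketch below shows one common way to score a tagger's output against gold labels: token-level accuracy and per-tag precision. The function names, the toy English sentence, and the tag labels (`DET`, `NOUN`, `VERB`, `ADV`) are assumptions made for this example; they are not the authors' data, tagset, or evaluation code.

```python
from collections import Counter


def token_accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def per_tag_precision(gold, predicted):
    """For each tag the tagger emitted: correct uses / all uses of that tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    predicted_counts = Counter(predicted)
    correct_counts = Counter(p for g, p in zip(gold, predicted) if g == p)
    return {tag: correct_counts[tag] / n for tag, n in predicted_counts.items()}


# Hypothetical toy data (not from the paper): one sentence, one tagging error.
tokens = ["The", "dog", "barks", "loudly"]
gold_tags = ["DET", "NOUN", "VERB", "ADV"]
predicted_tags = ["DET", "NOUN", "NOUN", "ADV"]  # "barks" mis-tagged as NOUN

accuracy = token_accuracy(gold_tags, predicted_tags)
precision = per_tag_precision(gold_tags, predicted_tags)

print(f"accuracy = {accuracy}")                 # 3 of 4 tokens correct
print(f"NOUN precision = {precision['NOUN']}")  # 1 of 2 NOUN predictions correct
```

Tags the tagger never predicts (here `VERB`) have no defined precision and are left out of the result rather than reported as zero; a real evaluation would need to state how such cases are handled.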


Updated: 2020-06-16
