Transformation Invariant Pashto Handwritten Text Classification and Prediction



Journal of Circuits, Systems and Computers (IF 1.5), Pub Date: 2022-08-17, DOI: 10.1142/s0218126623500202
Muhammad Shabir₁, Naveed Islam₁, Zahoor Jan₁, Inayat Khan₂


The use of handwriting recognition tools has increased year over year across commercial fields. As a result, handwritten text classification, recognition, and detection have become an active research subject for many scholars. Different techniques have been proposed to improve character recognition accuracy while reducing processing time for languages such as English, Arabic, Chinese, and European languages. Local and regional languages also need to be considered in research so that the scope of handwriting recognition tools extends to the global level. This paper presents a machine learning-based technique that provides an accurate, robust, and fast solution for handwritten Pashto text classification and recognition. Pashto is written in a cursive script, which poses numerous challenges for classification and recognition. The first challenge in this research was developing efficient and full-fledged datasets. Efficient recognition or prediction of Pashto handwritten text is not possible with ordinary feature extraction because of natural transformations and handwriting variations. We propose several invariant feature extraction techniques for handwritten Pashto text, namely radial, orthographic grid, perspective projection grid, retina, the slope of word trajectories, and cosine angles of tangent lines. During dataset creation, salt-and-pepper noise was introduced, which was removed using a statistical filter. Another challenge was the invalid, disconnected handwritten stroke trajectories of words; we also propose a technique to minimize this disconnection problem. The proposed approach uses a linear support vector machine (SVM) and an RBF-based SVM for classification and recognition.
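The abstract does not specify which statistical filter is used or how the radial feature is defined, so the following is only a minimal sketch under stated assumptions: a 3x3 median filter stands in for the statistical filter (a common choice for salt-and-pepper noise), and the radial feature is taken to be the fraction of ink pixels in concentric rings around the ink centroid. The function names `median_filter` and `radial_features` and the parameters `k` and `n_rings` are hypothetical and do not come from the paper.

```python
import math
from statistics import median


def median_filter(img, k=3):
    """Apply a k x k median filter to a 2D list of pixel values.

    Assumption: the paper's "statistical filter" is not specified;
    a median filter is used here purely as an illustration.
    Border pixels are handled by replicating the nearest edge pixel.
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


def radial_features(img, n_rings=4):
    """Return the fraction of ink pixels in each concentric ring.

    Hypothetical definition: rings are centred on the ink centroid and
    scaled by the largest centroid-to-ink distance, so the result does
    not depend on where the word sits in the image or on its size.
    """
    ink = [(y, x) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    if not ink:
        return [0.0] * n_rings
    cy = sum(y for y, _ in ink) / len(ink)
    cx = sum(x for _, x in ink) / len(ink)
    dists = [math.hypot(y - cy, x - cx) for y, x in ink]
    r_max = max(dists) or 1.0  # guard against a single-pixel image
    counts = [0] * n_rings
    for d in dists:
        ring = min(int(d / r_max * n_rings), n_rings - 1)
        counts[ring] += 1
    return [c / len(ink) for c in counts]
```

Because the rings depend only on distances from the centroid, the resulting vector is unchanged when the word is translated and is approximately unchanged under rotation, up to pixel discretisation. Vectors of this kind could then be passed to a linear or RBF-kernel SVM, as the abstract describes. Note that a 3x3 median filter also erodes one-pixel-wide strokes, so in practice the window size would need to be chosen relative to stroke width.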


Updated: 2022-08-17
