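Benchmark Pashto Handwritten Character Dataset and Pashto Object Character Recognition (OCR) Using Deep Neural Network with Rule Activation Function, Complexity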

Complexity (IF 1.7), Pub Date: 2021-03-05, DOI: 10.1155/2021/6669672
Imran Uddin₁, Dzati A. Ramli₂, Abdullah Khan₁, Javed Iqbal Bangash₁, Nosheen Fayyaz₃, Asfandyar Khan₁, Mahwish Kundi₄


Benchmark Pashto Handwritten Character Dataset and Pashto Object Character Recognition (OCR) Using Deep Neural Network with Rule Activation Function

In machine learning, different techniques are used to train machines to perform tasks such as computer vision, data analysis, natural language processing, and speech recognition. Computer vision is one of the main branches where machine learning and deep learning techniques are applied. Optical character recognition (OCR) is the ability of a machine to recognize the characters of a language. Pashto is one of the most ancient and historical languages of the world, spoken in Afghanistan and Pakistan. OCR applications have been developed for various cursive languages like Urdu, Chinese, and Japanese, but very little work has been done on recognizing Pashto. Handwritten character recognition is harder still, because the shape of every handwritten character is influenced by the writer's hand motion dynamics. Research on Pashto handwritten characters lags behind other languages because no benchmark dataset is available for experimental purposes. This study focuses on creating such a dataset and, for evaluation, training a machine to correctly recognize unseen Pashto handwritten characters. To this end, a dataset of 43,000 images was created. Three feed-forward neural network models trained with the backpropagation algorithm, using different Rectified Linear Unit (ReLU) layer configurations (Model 1 with 1 ReLU layer, Model 2 with 2 ReLU layers, and Model 3 with 3 ReLU layers), were trained and tested on this dataset. The simulation shows that Model 1 achieved accuracy of up to 87.6% on unseen data, while Model 2 and Model 3 achieved 81.60% and 3% accuracy, respectively. Similarly, the cross-entropy loss was lowest for Model 1, at 0.15 for training and 3.17 for testing, followed by Model 2 at 0.7 and 4.2, while Model 3 came last with loss values of 6.4 and 3.69. The precision, recall, and F-measure values of Model 1 were better than those of both Model 2 and Model 3. Based on these results, Model 1 (with 1 ReLU activation layer) is found to be the most efficient of the three models in terms of accuracy in recognizing Pashto handwritten characters.
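
As a rough illustration of what "a feed-forward network with 1, 2, or 3 ReLU layers" scored by cross-entropy and accuracy can look like, here is a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state the input size, hidden-layer width, number of classes, or weight initialization, so `n_inputs`, `n_hidden`, `n_classes`, and the He-style initialization are all assumptions, and the function names are hypothetical. The backpropagation training loop is omitted, so only the forward pass and the metrics are shown.

```python
import numpy as np

def init_mlp(n_inputs, n_hidden, n_relu_layers, n_classes, seed=0):
    """Build (W, b) pairs: n_relu_layers hidden layers plus one output layer.

    All sizes are illustrative assumptions; the abstract does not give them.
    """
    rng = np.random.default_rng(seed)
    sizes = [n_inputs] + [n_hidden] * n_relu_layers + [n_classes]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        # He-style initialization, a common choice for ReLU layers (assumed).
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        params.append((W, np.zeros(fan_out)))
    return params

def forward(params, x):
    """Apply ReLU on each hidden layer and softmax on the output layer."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)  # ReLU activation
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def accuracy(probs, labels):
    """Fraction of samples whose most probable class is the true class."""
    return float(np.mean(probs.argmax(axis=1) == labels))

if __name__ == "__main__":
    # Random stand-in data; shapes and class count are placeholders only.
    rng = np.random.default_rng(1)
    x = rng.random((8, 1024))
    y = rng.integers(0, 40, size=8)
    for k in (1, 2, 3):  # mirrors the 1-, 2-, and 3-ReLU-layer configurations
        probs = forward(init_mlp(1024, 128, k, 40), x)
        print(k, cross_entropy(probs, y), accuracy(probs, y))
```

A useful reference point when reading the reported losses: a classifier that assigns equal probability to each of K classes has a cross-entropy of ln K, so a loss near that value indicates predictions close to chance level.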

Updated: 2021-03-05
