Persian handwritten digit, character and word recognition using deep learning
International Journal on Document Analysis and Recognition (IF 1.8), Pub Date: 2021-04-29, DOI: 10.1007/s10032-021-00368-2
Mahdi Bonyani, Simindokht Jahangard, Morteza Daneshmand

Despite the many applications of digit, letter and word recognition, only a few studies have dealt with Persian scripts. In this paper, deep neural networks based on several DenseNet and Xception architectures are used, further boosted by data augmentation and test-time augmentation. The datasets are divided into training, validation and test sets, and k-fold cross-validation is used to compare the proposed method with various state-of-the-art alternatives. Three datasets are used: HODA, Sadri and Iranshahr, which offer the most comprehensive collections of samples in terms of handwriting styles and the forms each letter may take depending on its position within a word. On the HODA dataset, we achieve recognition rates of 99.49% and 98.10% for digits and characters, respectively; on the Sadri dataset, 99.72%, 89.99% and 98.82% for digits, characters and words, respectively; and on the Iranshahr dataset, 98.99% for words. Each of these results outperforms those achieved by the most advanced alternative networks, namely ResNet50 and VGG16. An additional contribution of the paper is that it treats word recognition as holistic image classification. This significantly improves speed and versatility, as it does not require explicit character models, unlike earlier alternatives such as hidden Markov models and convolutional recursive neural networks. In addition, computation times have been compared with those of alternative state-of-the-art models, and better performance has been observed.
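
The abstract names two evaluation techniques, test-time augmentation and k-fold cross-validation, without describing how they were implemented. The following is a minimal Python sketch of the general idea only, not the authors' code: `toy_model` is a hypothetical stand-in for a trained DenseNet or Xception classifier, `shift_right` is an assumed example augmentation, and all function names are illustrative.

```python
import random
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def shift_right(image: Image, pixels: int) -> Image:
    """Shift an image right by `pixels` columns (0 <= pixels <= width), zero-padding the left."""
    return [[0.0] * pixels + row[: len(row) - pixels] for row in image]


def tta_predict(
    model: Callable[[Image], List[float]],
    image: Image,
    augmentations: Sequence[Callable[[Image], Image]],
) -> List[float]:
    """Average class probabilities over the original image and its augmented views."""
    views = [image] + [augment(image) for augment in augmentations]
    predictions = [model(view) for view in views]
    n_classes = len(predictions[0])
    return [sum(p[c] for p in predictions) / len(predictions) for c in range(n_classes)]


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    splits = []
    for held_out in range(k):
        train = sorted(i for j, fold in enumerate(folds) if j != held_out for i in fold)
        splits.append((train, sorted(folds[held_out])))
    return splits


def toy_model(image: Image) -> List[float]:
    """Hypothetical two-class classifier: probability mass follows left-half intensity."""
    total = sum(sum(row) for row in image)
    left = sum(sum(row[: len(row) // 2]) for row in image)
    p_left = left / total if total else 0.5
    return [p_left, 1.0 - p_left]


if __name__ == "__main__":
    sample = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
    augmentations = [lambda img: shift_right(img, 1), lambda img: shift_right(img, 2)]
    print(tta_predict(toy_model, sample, augmentations))
    for train_idx, val_idx in k_fold_indices(10, 5):
        print(len(train_idx), len(val_idx))
```

In this sketch each augmented view is scored independently and the class probabilities are averaged, so a prediction that is unstable under small shifts is pulled toward the majority view. The k-fold helper only produces index splits; training a model on each split is left out.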




Updated: 2021-04-30