Meta features-based scale invariant OCR decision making using LSTM-RNN
Computational and Mathematical Organization Theory (IF 1.8), Pub Date: 2018-03-20, DOI: 10.1007/s10588-018-9265-9
Asma Naseer, Kashif Zafar

Urdu optical character recognition (OCR) is a complex problem because the script is cursive. Recognizing characters at different font sizes complicates the problem further. In this research, a long short-term memory recurrent neural network (LSTM-RNN) and a convolutional neural network (CNN) are used to recognize Urdu characters of different font sizes. The LSTM-RNN is trained on previously extracted feature sets, which are extracted for scale-invariant recognition of Urdu characters, and from these features it extracts meta features. The CNN is trained on raw binary images. Two benchmark datasets are used: Centre for Language Engineering Text Images (CLETI) and Urdu Printed Text Images (UPTI). The LSTM-RNN gives consistent results on both datasets and outperforms the CNN. A maximum accuracy of 99% is achieved with the LSTM-RNN.
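
The abstract does not say which scale-invariant features are extracted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical feature, `scale_invariant_profile`, that crops a binary glyph to its ink bounding box and summarizes it as a fixed-length column-density profile. The function name, the `n_bins` parameter, and the choice of feature are all assumptions made for illustration.

```python
import numpy as np


def scale_invariant_profile(glyph, n_bins=8):
    """Hypothetical feature: column-wise ink fraction of a binary glyph,
    averaged into n_bins equal-width bins over the ink bounding box."""
    glyph = np.asarray(glyph, dtype=float)
    rows = np.flatnonzero(glyph.any(axis=1))
    cols = np.flatnonzero(glyph.any(axis=0))
    if rows.size == 0:
        # Blank image: no ink, so return an all-zero profile.
        return np.zeros(n_bins)

    # Crop to the ink bounding box so margins do not affect the features.
    crop = glyph[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Ink fraction per column; dividing by the height removes vertical scale.
    density = crop.mean(axis=0)
    width = density.size

    # Integrate the piecewise-constant density exactly, then average it over
    # n_bins equal-width bins; this removes horizontal scale.
    cumulative = np.concatenate(([0.0], np.cumsum(density)))
    edges = np.linspace(0.0, width, n_bins + 1)
    integral = np.interp(edges, np.arange(width + 1), cumulative)
    return np.diff(integral) / (width / n_bins)


# A toy 4x3 "glyph" on a padded canvas, and the same glyph enlarged 3x.
small = np.zeros((8, 7), dtype=int)
small[2:6, 2:5] = [[1, 0, 1],
                   [1, 1, 1],
                   [1, 0, 0],
                   [1, 0, 0]]
large = np.kron(small, np.ones((3, 3), dtype=int))

features_small = scale_invariant_profile(small)
features_large = scale_invariant_profile(large)
```

Under these assumptions, `features_small` and `features_large` agree up to floating-point rounding, which is the property a scale-invariant feature set needs. A fixed-length vector like this could then be fed to a sequence model step by step.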

Updated: 2018-03-20