Two streams deep neural network for handwriting word recognition
Multimedia Tools and Applications (IF 3.6), Pub Date: 2020-10-07, DOI: 10.1007/s11042-020-09923-1
Alaa Sulaiman, Khairuddin Omar, Mohammad F. Nasrudin

Handwritten word recognition is one of the most active topics in automatic handwritten text recognition and has received considerable attention in recent years. Unlike character recognition, word recognition must deal with considerable variation in word shape and writing style. This paper proposes a novel deep model for language-independent handwritten word recognition. The proposed deep structure has two parallel stages that jointly learn character-level and word-level information. In the character-level stage, a weak character segmentation method is applied, followed by a series of long short-term memory (LSTM) layers that produce the character-level representation. The word-level stage employs a series of convolutional layers to represent the shape and structure of the word. These representations are then concatenated and passed to a series of fully connected layers that jointly learn the word-level and character-level information. Since character segmentation is language-dependent and error-prone, the proposed deep structure applies only a weak separation scheme and does not rely on any character segmentation algorithm. Thus, it effectively utilizes the character-level representation without being bound to any language model. The proposed methodology also uses two new data augmentation strategies, based on a psychological assumption, to improve the model's generalization performance. Experimental results on five public datasets, including Arabic, English, and German, demonstrate that the proposed deep model outperforms state-of-the-art methods.
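
The data flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the weak separation is done, so this example assumes uniform vertical strips, and it replaces the LSTM and convolutional layers with a trivial mean-intensity placeholder so that it runs with the standard library only. The names weak_segments, mean_intensity, and two_stream_features are hypothetical and do not come from the paper.

```python
from typing import List

Image = List[List[float]]  # grayscale word image as rows of pixel values


def weak_segments(image: Image, n_segments: int) -> List[Image]:
    """Cut the image into n_segments vertical strips of near-equal width.

    Assumption: uniform strips stand in for the paper's weak separation
    scheme; no character boundaries are detected.
    """
    width = len(image[0])
    if not 1 <= n_segments <= width:
        raise ValueError("n_segments must be between 1 and the image width")
    # Integer boundaries guarantee every strip has at least one column.
    bounds = [i * width // n_segments for i in range(n_segments + 1)]
    return [[row[lo:hi] for row in image] for lo, hi in zip(bounds, bounds[1:])]


def mean_intensity(image: Image) -> float:
    """Placeholder feature: average pixel value of an image or strip."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def two_stream_features(image: Image, n_segments: int) -> List[float]:
    # Character-level stream: one value per strip, ordered left to right
    # (placeholder for the LSTM layers, which would consume this sequence).
    char_stream = [mean_intensity(s) for s in weak_segments(image, n_segments)]
    # Word-level stream: one value for the whole image
    # (placeholder for the convolutional layers).
    word_stream = [mean_intensity(image)]
    # Fusion: concatenate both representations, as the input that the
    # fully connected layers would receive.
    return char_stream + word_stream


# Toy 2x6 "word image" used only to exercise the functions.
word = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0, 1.0, 1.0],
]
features = two_stream_features(word, n_segments=3)
```

In a real model the two placeholders would be learned networks; the sketch only shows that the character stream sees an ordered sequence of strips while the word stream sees the whole image, and that the two outputs are joined before classification.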




Updated: 2020-10-07