Nom document digitalization by deep convolution neural networks
Pattern Recognition Letters (IF 5.1), Pub Date: 2020-02-19, DOI: 10.1016/j.patrec.2020.02.015
Kha Cong Nguyen, Cuong Tuan Nguyen, Masaki Nakagawa

Nom is an ancient script that was used in Vietnam until the current Latin-based Vietnamese alphabet became common, and a large number of ancient Nom documents still exist. Because Nom documents are gradually degrading and fewer scholars can read them, a system to digitalize Nom documents is urgently needed. This paper presents a segmentation-based method for digitalizing Nom documents using deep convolution neural networks. Nom pages are preprocessed, segmented into isolated characters, and then recognized by a single-character OCR. A U-Net structure is applied to create segmentation maps, from which character regions are extracted. Subsequently, we propose combined coarse and fine classifiers to recognize each character pattern. The results of the best classifier are revised by a decoder using a language model. The decoder is the same as the connectionist temporal classification decoder used in end-to-end text recognition systems. Compared with the traditional segmentation method using projection profiles and the Voronoi diagram (IoU = 81.23%), the segmentation method using the deep convolution neural network produces a better result (IoU = 92.08%) for detecting character regions. The proposed CNN models for recognizing segmented character patterns outperform the traditional models that use the modified quadratic discriminant function and learning vector quantization, with a recognition rate of 85.07%. The combination of coarse and fine classifiers, a training dataset augmented with salt-and-pepper noise, and the attention layer are the key factors in the recognition rate improvement.
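Two quantities mentioned in the abstract, the IoU score and salt-and-pepper noise, can be illustrated with a minimal Python sketch. This is not the authors' code: it assumes IoU is measured between axis-aligned boxes given as (x1, y1, x2, y2), whereas the paper may compute it over pixel regions, and the function names `box_iou` and `add_salt_and_pepper` and the `amount` parameter are hypothetical.

```python
import random


def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumption: axis-aligned boxes; the paper may score pixel regions instead.
    """
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def add_salt_and_pepper(image, amount=0.05, rng=None):
    """Return a noisy copy of a grayscale image stored as rows of 0..255 ints.

    Roughly `amount` of the pixels are overwritten, half with 0 (pepper)
    and half with 255 (salt). The input image is left unchanged.
    """
    rng = rng or random.Random()
    noisy = [row[:] for row in image]
    for row in noisy:
        for x in range(len(row)):
            r = rng.random()
            if r < amount / 2:
                row[x] = 0
            elif r < amount:
                row[x] = 255
    return noisy
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` compares a predicted character box with a ground-truth box that overlaps it in a single unit square, and `add_salt_and_pepper(page, amount=0.05)` could be applied to each training image before it is fed to a classifier.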




Updated: 2020-03-07