Multi-lingual scene text detection and language identification
Pattern Recognition Letters (IF 3.9), Pub Date: 2020-06-27, DOI: 10.1016/j.patrec.2020.06.024
Shaswata Saha, Neelotpal Chakraborty, Soumyadeep Kundu, Sayantan Paul, Ayatullah Faruk Mollah, Subhadip Basu, Ram Sarkar

Scene text analysis is challenging because of complex backgrounds, variable image quality, and varying text orientation and size. The problem becomes harder when an image contains text in multiple languages. Most scene text detection techniques are either feature-based or deep learning-based. This work proposes an end-to-end system for scene text detection, localization and language identification that combines the two approaches. The model generates text proposals using Maximally Stable Extremal Regions (MSER) and the Stroke Width Transform (SWT), then refines the proposals with a Generative Adversarial Network (GAN). Finally, a Convolutional Neural Network (CNN) based model identifies the language of the detected scene text. Experiments have been conducted on standard datasets such as KAIST, COCO, CTW1500, CVSI and ICDAR, along with an in-house multi-lingual Indic scene text dataset, for which the proposed model achieves satisfactory results.
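
The data flow described in the abstract (proposal generation, proposal refinement, language identification) can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: the names `merge_proposals`, `run_pipeline` and `Detection` are hypothetical, the stages are passed in as plain callables standing in for the MSER/SWT proposers, the GAN-based refiner and the CNN classifier, and pooling the two proposal sets by IoU-based de-duplication is an assumption, since the abstract does not say how they are combined.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


@dataclass
class Detection:
    box: Box
    language: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_proposals(groups: Sequence[Sequence[Box]],
                    iou_threshold: float = 0.5) -> List[Box]:
    """Pool boxes from several generators, skipping near-duplicates.

    Assumption: the abstract does not describe how the MSER and SWT
    proposals are combined; IoU de-duplication is used here for illustration.
    """
    kept: List[Box] = []
    for group in groups:
        for box in group:
            if all(iou(box, k) < iou_threshold for k in kept):
                kept.append(box)
    return kept


def run_pipeline(image: Any,
                 proposers: Sequence[Callable[[Any], Sequence[Box]]],
                 refine: Callable[[Any, Box], bool],
                 identify_language: Callable[[Any, Box], str]) -> List[Detection]:
    """Chain the three stages: propose, refine, identify language."""
    # Stage 1: text proposals (stand-ins for the MSER and SWT generators).
    proposals = merge_proposals([propose(image) for propose in proposers])
    # Stage 2: keep only proposals the refiner accepts (stand-in for the GAN).
    refined = [box for box in proposals if refine(image, box)]
    # Stage 3: label each surviving region (stand-in for the CNN).
    return [Detection(box, identify_language(image, box)) for box in refined]
```

Passing the stages as callables keeps the skeleton independent of any particular library; real proposers, a trained refiner and a trained classifier would be supplied in place of the stand-ins.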



Updated: 2020-07-05