Multi-level Fuzzy Based Renyi Entropy for Linguistic Classification of Texts in Natural Scene Images, International Journal of Fuzzy Systems



International Journal of Fuzzy Systems (IF 4.3), Pub Date: 2019-05-20, DOI: 10.1007/s40815-019-00654-6
Angia Venkatesan Karpagam, Mohan Manikandan

This paper addresses linguistic classification of text in natural scene images. Text is localized by multi-level thresholding based on fuzzy Renyi entropy, and complex natural scene images with diverse challenges are considered. A set of heuristic rules comprising geometric filters and the stroke width transform governs the process of locating potential text regions. Scene images may contain more than one language, which makes text recognition by an optical character recognition (OCR) system challenging, because manual intervention is needed to specify the language of each text. To overcome this hurdle, the paper proposes linguistic classification of the text regions. The proposed method is validated on the publicly available MSRA-TD500 dataset. Results show that fuzzy-based Renyi entropy thresholding is able to segment the foreground text from complex natural scene images. Geometric filters capture the inherent uniformity of the text, and the stroke width transform eliminates non-text regions. Precision, recall and F-measure are 78%, 77% and 76%, respectively, which shows the ability of the algorithm to extract text from the scenes. Geometric features such as area and corners show greater variation between the languages and so help discriminate them. Further, the first three Hu moment features also play a notable role in analyzing the shape of the extracted text regions. A classifier based on a support vector machine (SVM) yields a classification accuracy of 85.45% in discriminating English and Chinese text, with an area under the ROC curve (AUC) of 0.851. The proposed methodology is reported to be robust against common degradations such as uneven illumination, varying font characteristics and blurring. Experimental results show that the method achieves better performance in linguistic classification.
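To make the thresholding idea concrete, the sketch below shows a plain (crisp, single-threshold) Renyi entropy threshold in Python with NumPy. It is not the authors' method: the abstract describes a fuzzy, multi-level variant whose membership functions and parameters are not given here. The function names `renyi_entropy` and `renyi_threshold`, the entropy order `alpha=0.5`, and the synthetic test image are all assumptions chosen for illustration. The threshold is picked to maximize the sum of the Renyi entropies of the background and foreground gray-level distributions.

```python
import numpy as np

def renyi_entropy(p, alpha=0.5):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1) of a distribution p."""
    p = p[p > 0]  # empty bins contribute nothing to the sum
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def renyi_threshold(image, alpha=0.5, bins=256):
    """Return the gray level t maximizing H(background) + H(foreground).

    Simplified, non-fuzzy, bi-level variant; `image` must hold integer
    gray levels in [0, bins - 1].
    """
    hist = np.bincount(image.ravel(), minlength=bins).astype(float)
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(bins - 1):
        w_bg = p[: t + 1].sum()   # probability mass at or below t
        w_fg = p[t + 1 :].sum()   # probability mass above t
        if w_bg <= 0 or w_fg <= 0:
            continue              # one class would be empty
        h = (renyi_entropy(p[: t + 1] / w_bg, alpha)
             + renyi_entropy(p[t + 1 :] / w_fg, alpha))
        if h > best_h:
            best_t, best_h = t, h
    return best_t

# Hypothetical usage on a synthetic two-mode image (dark background, bright text).
rng = np.random.default_rng(0)
pixels = np.concatenate([rng.normal(60, 10, 5000), rng.normal(190, 10, 5000)])
image = np.clip(pixels, 0, 255).astype(np.uint8).reshape(100, 100)
t = renyi_threshold(image)
mask = image > t  # candidate foreground pixels
print("threshold:", t, "foreground fraction:", mask.mean())
```

A multi-level version would search over several thresholds at once, and a fuzzy version would replace the hard split at `t` with membership weights; both are left out here because the abstract does not specify them.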


Updated: 2019-05-20
