Text and graphics segmentation of newspapers printed in Gurmukhi script: a hybrid approach
The Visual Computer ( IF 3.5 ) Pub Date : 2020-07-20 , DOI: 10.1007/s00371-020-01927-0
Rupinder Pal Kaur , M. K. Jindal , Munish Kumar

Newspapers have long been a standard medium for conveying important information to the public. An automated system is needed to convert this information into a processable form so that it can be searched. Much work has been done on typed and handwritten Gurmukhi script documents, but very few articles address text recognition or text and image segmentation for Gurmukhi script newspapers. Images and graphics must be segmented from the text before newspaper text is fed to OCR if the results are to be accurate. Many techniques for segmenting images and text have been proposed in the literature, but many of them are complex. In this article, the authors propose a simple and effective hybrid approach, based on the run length smoothing algorithm and projection profiles, to segment images from text in Gurmukhi script newspaper articles. The run length smearing algorithm is applied both horizontally and vertically to label the regions. A logical AND operator is applied to the resulting images to identify the text and image regions. The projection profile technique is then used to segment the image region from among the labeled regions. The combination of these two techniques has produced very good results.
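The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold parameters, and the choice to fill only background gaps that lie between two foreground pixels are assumptions made for illustration. The input is assumed to be a binarized page stored as a NumPy array in which 1 marks ink and 0 marks background.

```python
import numpy as np


def rlsa_1d(line, threshold):
    """Smear one row or column: fill background gaps of at most
    `threshold` pixels that lie between two foreground pixels.
    (Assumption: gaps touching the page border are left untouched.)"""
    out = line.copy()
    foreground = np.flatnonzero(line)
    for start, end in zip(foreground[:-1], foreground[1:]):
        gap = end - start - 1
        if 0 < gap <= threshold:
            out[start + 1:end] = 1
    return out


def rlsa(binary, threshold, axis):
    """Run length smearing along rows (axis=1) or columns (axis=0)."""
    return np.apply_along_axis(rlsa_1d, axis, binary, threshold)


def smear_and_combine(binary, h_threshold, v_threshold):
    """Smear horizontally and vertically, then combine with logical AND."""
    horizontal = rlsa(binary, h_threshold, axis=1)
    vertical = rlsa(binary, v_threshold, axis=0)
    return horizontal & vertical


def projection_profiles(binary):
    """Return (row sums, column sums) of foreground pixels."""
    return binary.sum(axis=1), binary.sum(axis=0)


if __name__ == "__main__":
    # Tiny hypothetical page: 1 = ink, 0 = background.
    page = np.array(
        [[1, 0, 0, 1, 0, 0, 0, 0, 1],
         [0, 0, 0, 0, 0, 0, 0, 0, 0],
         [1, 0, 0, 1, 0, 0, 0, 0, 1]],
        dtype=np.uint8,
    )
    combined = smear_and_combine(page, h_threshold=2, v_threshold=1)
    rows, cols = projection_profiles(combined)
    print(combined)
    print(rows, cols)
```

The thresholds control how far apart two ink pixels can be and still be merged into one region, so in practice they would need to be tuned to the page resolution and font size. How the projection profiles are then used to separate image regions from text regions is not specified in the abstract; a common heuristic is that text blocks show alternating peaks and gaps between lines, while picture regions tend to give a more uniformly dense profile.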

Updated: 2020-07-20