A two-step framework for text line segmentation in historical Arabic and Latin document images
International Journal on Document Analysis and Recognition (IF 1.8), Pub Date: 2021-06-11, DOI: 10.1007/s10032-021-00377-1
Olfa Mechi, Maroua Mehri, Rolf Ingold, Najoua Essoukri Ben Amara

Text line segmentation is one of the most important preliminary tasks in a transcription system for historical document images. Nevertheless, this task remains complex due to the idiosyncrasies of ancient document images. In this article, we present a complete framework for text line segmentation in historical Arabic or Latin document images. A two-step procedure is described. First, a deep fully convolutional network (FCN) architecture is applied to extract the main area covering the text core. To select the highest-performing FCN architecture, we conducted a thorough performance benchmark of the most recent and widely used FCN architectures for segmenting text lines in historical Arabic or Latin document images. Then, a post-processing step based on topological structure analysis is introduced to extract complete text lines (including the ascender and descender components). This second step aims at refining the obtained FCN results and at providing sufficient information for text recognition. Our experiments were carried out on a large number of Arabic and Latin document images collected from the Tunisian national archives, as well as on other benchmark datasets. Quantitative and qualitative assessments are reported, first to pinpoint the strengths and weaknesses of the different FCN architectures and second to illustrate the effectiveness of the proposed post-processing method.
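
The idea behind the second step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the topological structure analysis, so a simple connected-component heuristic is swapped in here. It assumes the FCN output has already been thresholded into a binary text-core mask and that the page has been binarized into an ink mask. Each ink component is then assigned to the text-core region it overlaps most, which pulls ascenders and descenders lying outside the core into the correct line. The names `label_components`, `extend_lines`, `core_mask` and `ink_mask` are hypothetical.

```python
from collections import Counter, deque


def label_components(mask):
    """Label 4-connected components of a binary grid (list of lists of 0/1).

    Returns (labels, count); labels are 1..count in row-major discovery order.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def extend_lines(core_mask, ink_mask):
    """Assign every ink component to the text-core region it overlaps most.

    Assumption (not from the paper): one connected core region per text line.
    Returns a grid of line ids; 0 marks background or ink touching no core.
    """
    core_labels, _ = label_components(core_mask)
    ink_labels, n_ink = label_components(ink_mask)
    h, w = len(ink_mask), len(ink_mask[0])

    # Count, for each ink component, how many of its pixels fall in each core.
    votes = [Counter() for _ in range(n_ink + 1)]
    for y in range(h):
        for x in range(w):
            if ink_labels[y][x] and core_labels[y][x]:
                votes[ink_labels[y][x]][core_labels[y][x]] += 1

    owner = [v.most_common(1)[0][0] if v else 0 for v in votes]
    return [[owner[ink_labels[y][x]] for x in range(w)] for y in range(h)]


# Toy page: two text-core bands (rows 1 and 4) and four ink strokes.
core = [[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0]]
ink = [[0, 1, 0, 0, 0],   # ascender above the first core band
       [0, 1, 0, 1, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [1, 0, 0, 1, 0],
       [0, 0, 0, 1, 0]]   # descender below the second core band
lines = extend_lines(core, ink)
```

In the toy page, the ascender pixel at row 0 and the descender pixel at row 5 lie outside both core bands, yet each inherits the line id of the stroke it is connected to. A real page would also need rules for components that touch several lines or none, which this sketch leaves out.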




Updated: 2021-06-13