Baselines Extraction from Curved Document Images via Slope Fields Recovery
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2018-12-14, DOI: 10.1109/tpami.2018.2886900
Gaofeng Meng, Chunhong Pan, Shiming Xiang, Ying Wu

Baseline estimation is a critical preprocessing step for many document image processing and analysis tasks. The problem is challenging because of arbitrarily complicated page layouts and various types of image quality degradation. This paper proposes a method based on slope field recovery for extracting curved baselines from a distorted document image captured by a hand-held camera. The method treats the curved baselines as the solution curves of an ordinary differential equation defined on a slope field. Assuming that the page shape is a smooth, developable surface, we investigate a type of intrinsic geometric constraint on the baselines and use it to estimate the latent slope field. The curved baselines are then obtained by solving the ordinary differential equation with the Euler method. Unlike traditional text-line-based methods, our method requires no text-line detection or segmentation. It can exploit multiple visual cues in the image other than horizontal text lines, and it is quite robust to document scripts, various types of image quality degradation (e.g., image distortion, blur and non-uniform illumination), large areas of non-textual objects and complex page layouts. Extensive experiments on synthetic and real-captured document images are conducted to evaluate the performance of the proposed method.
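
The final step described in the abstract, tracing a baseline as a solution curve of an ordinary differential equation dy/dx = s(x, y) with the Euler method, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `trace_baseline` and `example_slope` are hypothetical, and the slope field here is a synthetic analytic function chosen for illustration, whereas the paper estimates the slope field from the image.

```python
import math


def trace_baseline(slope, x0, y0, x_end, step=1.0):
    """Integrate dy/dx = slope(x, y) from (x0, y0) to x = x_end with forward Euler.

    `slope` is any callable returning the local slope at (x, y). In the paper
    this field is recovered from the image; here it is supplied by the caller.
    Returns the list of (x, y) points along the traced curve.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    x, y = float(x0), float(y0)
    points = [(x, y)]
    while x < x_end:
        h = min(step, x_end - x)   # shorten the last step to land on x_end
        y += h * slope(x, y)       # forward Euler update
        x += h
        points.append((x, y))
    return points


def example_slope(x, y):
    """Synthetic slope field (an assumption for illustration only).

    Its solution curves are y = c + 5 * sin(x / 100), a gently curved
    family loosely resembling baselines on a warped page.
    """
    return 0.05 * math.cos(x / 100.0)


# Trace one curve starting at the left edge, at height y = 50.
baseline = trace_baseline(example_slope, x0=0.0, y0=50.0, x_end=300.0, step=1.0)
```

Starting the trace from several seed points along the left margin would give a family of curves, one per baseline. Forward Euler is first-order accurate, so a smaller `step` reduces the drift from the exact solution curve at the cost of more evaluations of the slope field.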

Updated: 2020-03-06