CT-Net: Cascade T-shape deep fusion networks for document binarization

Pattern Recognition (IF 8), Pub Date: 2021-05-05, DOI: 10.1016/j.patcog.2021.108010
Sheng He, Lambert Schomaker

Document binarization is a key step in most document analysis tasks. However, historical-document images usually suffer from various degradations, which makes binarization a very challenging processing stage. The performance of document image binarization has improved dramatically in recent years through the use of Convolutional Neural Networks (CNNs). In this paper, a dual-task, T-shaped neural network is proposed that has the main task of binarization and an auxiliary task of image enhancement. The enhancement network learns the degradations in document images, and the specific CNN-kernel features can be adapted towards the binarization task during training. In addition, the enhanced image can be considered an improved version of the input image, which can be fed back into the network for fine-tuning, making it possible to design a chained-cascade network (CT-Net). Experimental results on the document binarization competition (DIBCO) datasets and the MCS dataset show that the proposed method outperforms competing state-of-the-art methods in most cases.
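
The cascade idea in the abstract (each stage emits an enhanced image and a binary map, and the enhanced image becomes the next stage's input) can be illustrated with a minimal data-flow sketch. This is not the authors' network: `toy_stage` is a hypothetical stand-in that uses contrast stretching and a fixed 0.5 threshold in place of the CNN, and the names `Image`, `Stage`, and `run_cascade` are assumptions made for illustration. The abstract does not say how the per-stage outputs are combined, so the sketch simply returns all of them.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]                        # grayscale pixels in [0, 1]
Stage = Callable[[Image], Tuple[Image, Image]]   # returns (enhanced, binary)


def toy_stage(image: Image) -> Tuple[Image, Image]:
    """Hypothetical stand-in for one dual-task stage (NOT the paper's CNN).

    Auxiliary task: "enhance" the image by stretching contrast to [0, 1].
    Main task: binarize by thresholding the enhanced image at 0.5.
    """
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on flat images
    enhanced = [[(p - lo) / span for p in row] for row in image]
    binary = [[1.0 if p >= 0.5 else 0.0 for p in row] for row in enhanced]
    return enhanced, binary


def run_cascade(image: Image, stages: List[Stage]) -> List[Image]:
    """Chain the stages: each stage's enhanced output feeds the next stage."""
    binaries: List[Image] = []
    current = image
    for stage in stages:
        enhanced, binary = stage(current)
        binaries.append(binary)
        current = enhanced  # the enhanced image acts as an improved input
    return binaries


if __name__ == "__main__":
    degraded = [[0.2, 0.4], [0.6, 0.8]]
    for i, b in enumerate(run_cascade(degraded, [toy_stage, toy_stage]), 1):
        print(f"stage {i} binary map: {b}")
```

In a real implementation each stage would be a trained network with a binarization loss and an enhancement loss; the sketch only shows how the stages are chained.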

Updated: 2021-05-26
