Multi-scale Multi-attention Network for Moiré Document Image Binarization, Signal Processing: Image Communication



Multi-scale Multi-attention Network for Moiré Document Image Binarization
Signal Processing: Image Communication (IF 3.4) Pub Date: 2020-11-04, DOI: 10.1016/j.image.2020.116046
Yanqing Guo, Caijuan Ji, Xin Zheng, Qianyu Wang, Xiangyang Luo

In this paper, we propose a Multi-scale Multi-attention Network (MsMa-Net) to binarize document images contaminated by moiré patterns, which arise when a screen is photographed with a camera. Given a contaminated image, MsMa-Net first learns to distinguish clean features from contaminated ones at different spatial scales via a Multi-scale feature extraction submodule (Ms-sub). This preserves as much detailed text information as possible while providing a preliminary suppression of the moiré patterns. The resulting multi-scale features are then adaptively interwoven by a Multi-attention submodule (Ma-sub) at the channel level, the spatial level, and the correlation level. By modelling these relationships among the multi-scale features, Ma-sub further highlights text content and suppresses moiré patterns, yielding clean demoiréd document images. The demoiréd images are then passed to a Binarization submodule (Bi-sub), which produces the final high-quality binarized document images. In addition, because data for the moiré document image binarization task is scarce, we create a new Moiré Document Image (MoDI) dataset for training and evaluating the proposed model. Extensive experiments demonstrate that MsMa-Net achieves state-of-the-art performance on several available datasets and on the MoDI dataset.
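
The three-stage data flow described in the abstract (multi-scale features, attention-weighted fusion, binarization) can be illustrated with a toy sketch. This is not the authors' network: MsMa-Net is a learned model whose layers are not specified in the abstract, whereas the code below uses fixed average pooling, a single sigmoid gate per scale as a stand-in for channel-level attention only (the spatial and correlation levels are omitted), and a global threshold. All function names and the parameters `a`, `b`, `scales`, and `threshold` are hypothetical.

```python
import math


def pool_and_upsample(img, s):
    """Average-pool a 2D image in s-by-s blocks, then copy each block mean
    back to every pixel of the block so the output keeps the input size."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for by in range(0, h, s):
        for bx in range(0, w, s):
            ys = range(by, min(by + s, h))
            xs = range(bx, min(bx + s, w))
            mean = sum(img[y][x] for y in ys for x in xs) / (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def multi_scale(img, scales=(1, 2)):
    """One feature map per scale; coarser scales smooth out fine ripples."""
    return [pool_and_upsample(img, s) for s in scales]


def channel_gates(features, a, b):
    """Toy channel attention: squeeze each map to its global mean, then
    apply a sigmoid gate with per-map parameters a[k], b[k] (hypothetical
    stand-ins for weights that a real network would learn)."""
    gates = []
    for f, a_k, b_k in zip(features, a, b):
        squeeze = sum(sum(row) for row in f) / (len(f) * len(f[0]))
        gates.append(1.0 / (1.0 + math.exp(-(a_k * squeeze + b_k))))
    return gates


def fuse(features, gates):
    """Gate-weighted average of the feature maps, pixel by pixel."""
    total = sum(gates)
    h, w = len(features[0]), len(features[0][0])
    return [
        [sum(g * f[y][x] for g, f in zip(gates, features)) / total
         for x in range(w)]
        for y in range(h)
    ]


def binarize(img, threshold=0.5):
    """Global threshold: 1 for background (bright), 0 for text (dark)."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


def toy_pipeline(img, scales=(1, 2), a=None, b=None, threshold=0.5):
    """Multi-scale features -> gated fusion -> binarization."""
    feats = multi_scale(img, scales)
    a = a if a is not None else [1.0] * len(feats)
    b = b if b is not None else [0.0] * len(feats)
    return binarize(fuse(feats, channel_gates(feats, a, b)), threshold)
```

Calling `toy_pipeline` on a small grayscale grid (values in the range 0 to 1, given as a list of rows) returns a grid of 0s and 1s of the same size. The coarse scale averages away a fine checkerboard ripple, which is the intuition behind using several spatial scales to separate text from moiré patterns.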


Updated: 2020-11-12
