DIGITNET: A Deep Handwritten Digit Detection and Recognition Method Using a New Historical Handwritten Digit Dataset
Big Data Research ( IF 3.5 ) Pub Date : 2020-12-28 , DOI: 10.1016/j.bdr.2020.100182
Huseyin Kusetogullari , Amir Yavariabdi , Johan Hall , Niklas Lavesson

This paper introduces a novel deep learning architecture, named DIGITNET, and a large-scale digit dataset, named DIDA, to detect and recognize handwritten digits in historical document images written in the nineteenth century. To generate the DIDA dataset, digit images were collected from 100,000 Swedish handwritten historical document images, which were written by different priests with different handwriting styles. The dataset contains three sub-datasets: single digits, large-scale bounding-box-annotated multi-digit images, and digit strings, with 250,000, 25,000, and 200,000 samples, respectively, in the Red-Green-Blue (RGB) color space. DIDA is used to train the DIGITNET network, which consists of two deep learning architectures, DIGITNET-dect and DIGITNET-rec, that isolate digits and recognize digit strings in historical handwritten documents, respectively. In the DIGITNET-dect architecture, features are extracted from digits using three residual units, each containing three convolutional neural network structures; a detection strategy based on the You Only Look Once (YOLO) algorithm is then employed to detect handwritten digits at two different scales. In DIGITNET-rec, the detected isolated digits are passed through three differently designed Convolutional Neural Network (CNN) architectures, and the classification results of the three CNNs are combined using a voting scheme to recognize digit strings. The proposed model is also trained with various existing handwritten digit datasets and then validated on historical handwritten digit strings. The experimental results show that the proposed architecture trained with DIDA (publicly available from: https://didadataset.github.io/DIDA/) outperforms the state-of-the-art methods.
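The abstract does not specify how the voting scheme in DIGITNET-rec works. As a rough illustration only, the sketch below assumes hard majority voting over the class labels predicted by three classifiers for each detected digit, with ties broken in favor of the first classifier. The function names `majority_vote` and `recognize_string`, the tie-breaking rule, and the input layout are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[int]) -> int:
    """Return the most frequent label among the classifiers' predictions.

    Assumption (not from the paper): ties are broken in favor of the
    label that appears earliest in `predictions`.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top_count = max(counts.values())
    # Walk in input order so the earliest classifier wins a tie.
    for label in predictions:
        if counts[label] == top_count:
            return label
    raise AssertionError("unreachable: some label always has the top count")


def recognize_string(per_digit_predictions: Sequence[Sequence[int]]) -> str:
    """Combine per-digit votes into a digit string.

    Each inner sequence holds the labels that the individual classifiers
    assigned to one detected digit; digits are assumed to be already
    ordered left to right.
    """
    return "".join(str(majority_vote(votes)) for votes in per_digit_predictions)


if __name__ == "__main__":
    # Three hypothetical classifiers voting on three detected digits.
    votes = [[1, 1, 7], [8, 3, 8], [5, 5, 6]]
    print(recognize_string(votes))  # prints 185
```

A soft-voting variant that averages class probabilities would be an equally plausible reading of "voting scheme"; the hard-voting form is shown here only because it needs no assumptions about the classifiers' output scores.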




Updated: 2021-01-06