
-
Translating math formula images to LaTeX sequences using deep neural networks with sequence-level training Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-11-08 Zelun Wang, Jyh-Charn Liu
In this paper, we propose a deep neural network model with an encoder–decoder architecture that translates images of math formulas into their LaTeX markup sequences. The encoder is a convolutional neural network that transforms images into a group of feature maps. To better capture the spatial relationships of math symbols, the feature maps are augmented with 2D positional encoding before being unfolded
-
Optical character recognition with neural networks and post-correction with finite state methods Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-08-20 Senka Drobac, Krister Lindén
The optical character recognition (OCR) quality of the historical part of the Finnish newspaper and journal corpus is rather low for reliable search and scientific research on the OCRed data. The estimated character error rate (CER) of the corpus, achieved with commercial software, is between 8 and 13%. There have been earlier attempts to train high-quality OCR models with open-source software, like
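The character error rate quoted in this abstract is conventionally computed as the edit (Levenshtein) distance between the OCR output and the ground-truth transcription, divided by the length of the ground truth. The sketch below assumes that standard definition; the function names are illustrative and do not come from the paper.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by ground-truth length."""
    if not ref:
        raise ValueError("ground-truth text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a CER of 8 to 13% corresponds to roughly one character error in every eight to twelve ground-truth characters.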
-
DetectGAN: GAN-based text detector for camera-captured document images Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-08-10 Jinyuan Zhao, Yanna Wang, Baihua Xiao, Cunzhao Shi, Fuxi Jia, Chunheng Wang
Nowadays, with the development of electronic devices, more and more attention has been paid to camera-based text processing. Unlike with scene images, a recognition system for document images needs to sort out the recognition results and store them in a structured document for subsequent data processing. However, in document images, the fusion of text lines largely depends on their semantic
-
A robust watermarking approach for security issue of binary documents using fully convolutional networks Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-08-11 Vinh Loc Cu; Trac Nguyen; Jean-Christophe Burie; Jean-Marc Ogier
Motivated by the increasing possibility of tampering with genuine documents during transmission over digital channels, we focus on developing a watermarking framework for determining whether a received document is genuine or falsified, which is performed by hiding a security feature or secret information within it. To begin with, the input document is transformed into a standard form to minimize geometric
-
Single shot multi-oriented text detection based on local and non-local features Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-08-04 XiaoQian Li, Jie Liu, ShuWu Zhang, GuiXuan Zhang, Yang Zheng
In order to improve the robustness of text detector on scene text of various scales, a single shot text detector that combines local and non-local features is proposed in this paper. A dilated inception module for local feature extraction and a text self-attention module for non-local feature extraction are presented, and these two kinds of modules are integrated into single shot detector (SSD) of
-
Automatic room information retrieval and classification from floor plan using linear regression model Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-07-30 Hiren K. Mewada, Amit V. Patel, Jitendra Chaudhari, Keyur Mahant, Alpesh Vala
The automatic creation of a repository of building floor plans helps architects reuse them. The basic approach is to extract and recognize texts, symbols or graphics to retrieve the information of the floor plan from the images. This paper proposes a floor plan information retrieval algorithm. The proposed algorithm is based on shape extraction and room identification. \(\alpha
-
Model-based Persian calligraphy synthesis via learning to transfer templates to personal styles Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-06-18 Amirhossein Ahmadian; Kazim Fouladi; Babak Nadjar Araabi
Current software tools for computer generation of Persian calligraphy can be mostly described as conventional fonts and typesetting software, which basically neglect the ‘variations’ of real calligraphy performed by hand, in terms of personalization to different calligraphers’ styles, as well as their statistical characteristics. In this paper, we address the problem of natural-looking Persian calligraphy
-
Hyperkernel-based intuitionistic fuzzy c-means for denoising color archival document images Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-03-10 Walid Elhedda; Maroua Mehri; Mohamed Ali Mahjoub
In this article, we have addressed the problem of denoising and enhancement of color archival handwritten document images by separating noise from text and background. Indeed, archival document images that originated from scanning or photographing paper documents are mainly digitized in full color mode. Thus, it is necessary to preserve and exploit color information when applying an enhancement method
-
Exploiting complexity in pen- and touch-based signature biometrics Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-02-11 Ruben Tolosana; Ruben Vera-Rodriguez; Richard Guest; Julian Fierrez; Javier Ortega-Garcia
Biometric signature verification has been traditionally performed in pen-based office-like scenarios using devices specifically designed for acquiring handwriting. However, the high deployment of devices such as smartphones and tablets has given rise to new and thriving scenarios for signature biometrics where handwriting can be performed using not only a pen stylus but also the finger via touch interaction
-
Fast multi-language LSTM-based online handwriting recognition Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-02-08 Victor Carbune; Pedro Gonnet; Thomas Deselaers; Henry A. Rowley; Alexander Daryin; Marcos Calvo; Li-Lun Wang; Daniel Keysers; Sandro Feuz; Philippe Gervais
We describe an online handwriting system that is able to support 102 languages using a deep neural network architecture. This new system has completely replaced our previous segment-and-decode-based system and reduced the error rate by 20–40% relative for most languages. Further, we report new state-of-the-art results on IAM-OnDB for both the open and closed dataset setting. The system combines methods
-
A general framework for the recognition of online handwritten graphics Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2020-01-03 Frank Julca-Aguilar; Harold Mouchère; Christian Viard-Gaudin; Nina S. T. Hirata
We revisit graph grammar and graph parsing as tools for recognizing graphics. A top-down approach for parsing families of handwritten graphics containing different kinds of symbols and of structural relations is proposed. It has been tested on two distinct domains, namely the recognition of handwritten mathematical expressions and of handwritten flowcharts. In the proposed approach, a graphic is considered
-
MA-CRNN: a multi-scale attention CRNN for Chinese text line recognition in natural scenes Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-11-15 Guofeng Tong; Yong Li; Huashuai Gao; Huairong Chen; Hao Wang; Xiang Yang
The recognition methods for Chinese text lines, as an important component of optical character recognition, have been widely applied in many specific tasks. However, there are still some potential challenges: (1) lack of open Chinese text recognition dataset; (2) challenges caused by the characteristics of Chinese characters, e.g., diverse types, complex structure and various sizes; (3) difficulties
-
Efficient and effective OCR engine training Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-10-30 Christian Clausner; Apostolos Antonacopoulos; Stefan Pletschacher
We present an efficient and effective approach to train OCR engines using the Aletheia document analysis system. All components required for training are seamlessly integrated into Aletheia: training data preparation, the OCR engine’s training processes themselves, text recognition, and quantitative evaluation of the trained engine. Such a comprehensive training and evaluation system, guided through
-
Even big data is not enough: need for a novel reference modelling for forensic document authentication Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-10-14 Utpal Garain; Biswajit Halder
With the emergence of big data, deep learning (DL) approaches are becoming quite popular in many branches of science. Forensic science is no longer an exception. However, there are certain problems in forensic science where the solutions would hardly benefit from the recent advances in DL algorithms. Document authentication is one such problem where we can have many reference samples, and with the
-
An adaptive document recognition system for lettrines Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-10-10 Nhu-Van Nguyen; Mickael Coustaty; Jean-Marc Ogier
In this paper, we propose an approach to interactively propagate annotations representing the historians’ knowledge on a database of lettrine images manually populated by historians (with annotations). Based on a novel document indexing processing scheme which combines the use of the Zipf law and the use of bag of patterns, our approach extends the bag-of-words model to represent the knowledge by visual
-
Document analysis systems that improve with use Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-09-21 George Nagy
Document analysis tasks for which representative labeled training samples are available have been largely solved. The next frontier is coping with hitherto unseen formats, unusual typefaces, idiosyncratic handwriting and imperfect image acquisition. Adaptive and style-constrained classification methods can overcome some expected variability, but human intervention will remain necessary in many tasks
-
A unified method for augmented incremental recognition of online handwritten Japanese and English text Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-09-05 Cuong Tuan Nguyen; Bipin Indurkhya; Masaki Nakagawa
We present a unified method for augmented incremental recognition of online handwritten Japanese and English text, which is used for busy or on-the-fly recognition while writing, and lazy or delayed recognition after writing, without incurring long waiting times. It extends the local context for segmentation and recognition to a range of recent strokes called “segmentation scope” and “recognition scope
-
Coarse-to-fine document localization in natural scene image with regional attention and recursive corner refinement Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-08-08 Anna Zhu; Chen Zhang; Zhi Li; Shengwu Xiong
Document localization is a promising step for document-based optical character recognition. This task gains difficulty when documents are located in complex natural scene images. In this paper, we propose a coarse-to-fine document localization approach to detect the four corner points of the document in natural scene images. In the first stage, the four corners are roughly predicted through a deep
-
Handwritten Arabic text recognition using multi-stage sub-core-shape HMMs Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-08-06 Irfan Ahmad; Gernot A. Fink
In this paper, we present a multi-stage HMM-based text recognition system for handwritten Arabic. This system employs a novel way of representing Arabic characters by separating the core shapes from the diacritics and then representing these core shapes by smaller units which we term as sub-core shapes. This results in huge reductions in the number of models that need to be trained for the text recognition
-
A novel feature transform framework using deep neural network for multimodal floor plan retrieval Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-08-05 D. Sharma; N. Gupta; C. Chattopadhyay; S. Mehta
In the recent past, there has been a steep increase in the use of online platforms to search for desired products. The real estate industry is no exception and has started offering houses for rent or sale through online platforms. In this paper, we propose a deep neural network framework to facilitate automatic search of homes based on their floor plans. The salient features of this framework are that the
-
Evaluation of word spotting under improper segmentation scenario Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-08-01 Sounak Dey; Anguelos Nicolaou; Josep Lladós; Umapada Pal
Word spotting is an important recognition task in large-scale retrieval of document collections. In most of the cases, methods are developed and evaluated assuming perfect word segmentation. In this paper, we propose an experimental framework to quantify the goodness that word segmentation has on the performance achieved by word spotting methods in identical unbiased conditions. The framework consists
-
Total-Text: toward orientation robustness in scene text detection Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-08-01 Chee-Kheng Ch’ng; Chee Seng Chan; Cheng-Lin Liu
At present, text orientation is not diverse enough in the existing scene text datasets. Specifically, curve-oriented text is greatly outnumbered by horizontal and multi-oriented text; hence, it has received minimal attention from the community so far. Motivated by this phenomenon, we collected a new scene text dataset, Total-Text, which emphasizes diversity in text orientation. It is the first
-
Patch-based offline signature verification using one-class hierarchical deep learning Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-31 Sima Shariatmadari; Sima Emadi; Younes Akbari
Automatic processing of offline signature verification (in general) can be considered as a low-cost solution to problems in biometrics in comparison with other solutions (e.g., fingerprint, face verification, etc.). This study aims to present a novel writer-dependent approach to verifying an individual’s signature through offline image patches of their handwriting. The proposed approach is based on
-
HWNet v2: an efficient word image representation for handwritten documents Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-31 Praveen Krishnan; C. V. Jawahar
We present a framework for learning an efficient holistic representation for handwritten word images. The proposed method uses a deep convolutional neural network with traditional classification loss. The major strengths of our work lie in: (i) the efficient usage of synthetic data to pre-train a deep network, (ii) an adapted version of the ResNet-34 architecture with the region of interest pooling
-
HanFont: large-scale adaptive Hangul font recognizer using CNN and font clustering Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-31 Jinhyeok Yang; Heebeom Kim; Hyobin Kwak; Injung Kim
We propose a large-scale Hangul font recognizer that is capable of recognizing 3300 Hangul fonts. Large-scale Hangul font recognition is a challenging task. Typically, Hangul fonts are distinguished by small differences in detailed shapes, which are often ignored by the recognizer. There are additional issues in practical applications, such as the existence of almost indistinguishable fonts and the
-
An anchor-free region proposal network for Faster R-CNN-based text detection approaches Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-25 Zhuoyao Zhong; Lei Sun; Qiang Huo
The anchor mechanism of the Faster R-CNN and SSD frameworks is considered not effective enough for scene text detection, which can be attributed to its Intersection-over-Union-based matching criterion between anchors and ground-truth boxes. In order to better enclose scene text instances of various shapes, anchors of various scales, aspect ratios and even orientations must be designed manually, which
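The Intersection-over-Union criterion mentioned in this abstract can be illustrated for axis-aligned boxes. This is a minimal sketch of the standard definition, assuming boxes given as `(x1, y1, x2, y2)` corner tuples; the function name and the example boxes are hypothetical and are not taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# A square anchor versus a long, thin box of the kind a text line might occupy.
square_anchor = (0.0, 0.0, 10.0, 10.0)
thin_text_box = (0.0, 4.0, 10.0, 6.0)
overlap = iou(square_anchor, thin_text_box)  # 20 / 100 = 0.2
```

In this example the thin box lies entirely inside the anchor yet scores only 0.2, which is consistent with the abstract's point that anchors of many aspect ratios are needed to match text well under an IoU criterion.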
-
On optimal stopping strategies for text recognition in a video stream as an application of a monotone sequential decision model Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-23 Konstantin Bulatov; Nikita Razumnyi; Vladimir V. Arlazarov
The paper describes the problem of stopping the text field recognition process in a video stream, which is a novel problem, particularly relevant to real-time mobile document recognition systems. A decision-theoretic framework for this problem is provided, and similarities with existing stopping rule problems are explored. Following the theoretical works on monotone stopping rule problems, a strategy
-
A two-stage method for text line detection in historical documents Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-23 Tobias Grüning; Gundram Leifert; Tobias Strauß; Johannes Michael; Roger Labahn
This work presents a two-stage text line detection method for historical documents. Each detected text line is represented by its baseline. In a first stage, a deep neural network called ARU-Net labels pixels to belong to one of the three classes: baseline, separator and other. The separator class marks beginning and end of each text line. The ARU-Net is trainable from scratch with manageably few manually
-
Comic MTL: optimized multi-task learning for comic book image analysis Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-17 Nhu-Van Nguyen; Christophe Rigaud; Jean-Christophe Burie
Comic book image analysis methods often propose multiple algorithms or models for multiple tasks like panel and character (body and face) detection, balloon segmentation, text recognition, etc. In this work, we aim to reduce the processing time for comic book image analysis by proposing one model that can learn multiple tasks called Comic MTL instead of using one model per task. In addition to detection
-
A comparison of local features for camera-based document image retrieval and spotting Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-12 Quoc Bao Dang; Mickaël Coustaty; Muhammad Muzzamil Luqman; Jean-Marc Ogier
This paper compares the robustness of local features for camera-based document image retrieval and spotting systems. We present a literature review of the state of the art in local feature extraction, covering keypoint detectors and keypoint descriptors. We also present a dataset and evaluation protocol for camera-based document image retrieval and spotting systems. This dataset is composed
-
Dynamic temporal residual network for sequence modeling Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-07-02 Ruijie Yan; Liangrui Peng; Shanyu Xiao; Michael T. Johnson; Shengjin Wang
The long short-term memory (LSTM) network with gating mechanism has been widely used in sequence modeling tasks including handwriting and speech recognition. As an LSTM network can be unfolded along the temporal dimension and its temporal depth is equal to the length of the input feature sequence, the introduction of gating might not be sufficient to completely model the dynamic temporal dependencies
-
Generalized framework for summarization of fixed-camera lecture videos by detecting and binarizing handwritten content Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-06-15 Bhargava Urala Kota; Kenny Davila; Alexander Stone; Srirangaraj Setlur; Venu Govindaraju
We propose a framework to extract and binarize handwritten content in lecture videos. The extracted content could potentially be used to index video collections powering content-based search and navigation within lecture videos helping students and educators across the world. A deep learning pipeline is used to detect handwritten text, formulae and sketches and then binarize the extracted content.
-
Are 2D-LSTM really dead for offline text recognition? Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-06-06 Bastien Moysset; Ronaldo Messina
There is a recent trend in handwritten text recognition with deep neural networks to replace 2D recurrent layers with 1D and in some cases even completely remove the recurrent layers, relying on simple feed-forward convolutional-only architectures. The most used type of recurrent layer is the long short-term memory (LSTM). The motivations to do so are many: there are few open-source implementations
-
Boosting scene character recognition by learning canonical forms of glyphs Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-06-04 Yizhi Wang; Zhouhui Lian; Yingmin Tang; Jianguo Xiao
As one of the fundamental problems in document analysis, scene character recognition has attracted considerable interest in recent years. But the problem is still considered to be extremely challenging due to many uncontrollable factors including glyph transformation, blur, noisy background, uneven illumination, etc. In this paper, we propose a novel methodology for boosting scene character recognition
-
A novel CNN structure for fine-grained classification of Chinese calligraphy styles Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-04-19 Jiulong Zhang; Mingtao Guo; Jianping Fan
Chinese calligraphy is a valuable cultural heritage belonging to the world. It is liked by many people, and our mission is to endeavor to pursue calligraphy and make contributions to the business with technical means. The automatic recognition of the styles of calligraphy by image processing techniques has important meaning in arts collection and auction, etc. Traditional feature operators have some
-
Bleed-through cancellation in non-rigidly misaligned recto–verso archival manuscripts based on local registration Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-04-05 Pasquale Savino; Anna Tonazzini; Luigi Bedini
Ancient manuscripts written on both pages of the sheet are frequently affected by ink bleeding from the reverse side. This phenomenon produces a significant degradation of both the foreground text and the general appearance of the manuscript. Effective digital image restoration techniques may require the use of the content of both document sides, thus needing their perfect alignment. Although often
-
Scene text detection and recognition with advances in deep learning: a survey Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-03-27 Xiyan Liu; Gaofeng Meng; Chunhong Pan
Scene text detection and recognition has become a very active research topic in recent years. It has many real-world applications, ranging from navigation for vision-impaired people to semantic natural scene understanding. In this survey, we intend to give a thorough and in-depth review of the recent advances on this topic, mainly focusing on the methods that appeared in the past
-
The achievement of higher flexibility in multiple-choice-based tests using image classification techniques Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-03-19 Mahmoud Afifi; Khaled F. Hussain
In spite of the high accuracy of the existing optical mark reading (OMR) systems and devices, a few restrictions remain. In this work, we aim to reduce the restrictions of multiple-choice questions (MCQ) within tests. We use an image registration technique to extract the answer boxes from answer sheets. Unlike other systems that rely on simple image processing steps to recognize the extracted
-
Real-time Kinect-based air-writing system with a novel analytical classifier Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-03-18 Shahram Mohammadi; Reza Maleki
Air-writing is an attractive method of interaction between human and machine due to the lack of any interface device on the user side. Once existing limitations are removed and current challenges solved, it can be used in many applications in the future. In this paper, using the Kinect depth and color images, an air-writing system is proposed to identify single characters such as digits or letters and
-
Metro maps for efficient knowledge learning by summarizing massive electronic textbooks Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-03-14 Weiming Lu; Pengkun Ma; Jiale Yu; Yangfan Zhou; Baogang Wei
As the number of textbooks soars, learners may find themselves lost among thousands of books. In order to provide a concise yet comprehensive picture for learning, we propose a novel framework, called MM4Books, to automatically build metro maps for efficient knowledge learning by summarizing massive electronic textbooks. We represent each book in digital libraries as a sequence of chapters
-
Plug-and-play approach to class-adapted blind image deblurring Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-03-02 Marina Ljubenović; Mário A. T. Figueiredo
Most of the existing single-image blind deblurring methods are tailored for natural images. However, in many important applications (e.g., document analysis, forensics), the image being recovered belongs to a specific class (e.g., text, faces, fingerprints) or contains two or more classes. To deal with these images, we propose a class-adapted blind deblurring framework, based on the plug-and-play scheme
-
A framework for information extraction from tables in biomedical literature Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-02-15 Nikola Milosevic; Cassie Gregson; Robert Hernandez; Goran Nenadic
The scientific literature is growing exponentially, and professionals are no longer able to cope with the current volume of publications. In the past, text mining provided methods to retrieve and extract information from text; however, most of these approaches ignored tables and figures. Research on mining table data still lacks an integrated approach that would consider all
-
Online signature verification based on string edit distance Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-02-11 Kaspar Riesen; Roman Schmidt
Handwritten signatures are widely used and well-accepted biometrics for personal authentication. The accuracy of signature verification systems has significantly improved in the last decade, making it possible to rely on machines in particular cases or to support human experts. Yet, based on only few genuine references, signature verification is still a challenging task. The present paper provides
-
Stroke order normalization for improving recognition of online handwritten mathematical expressions Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2019-01-17 Anh Duc Le; Hai Dai Nguyen; Bipin Indurkhya; Masaki Nakagawa
We present a technique based on stroke order normalization for improving recognition of online handwritten mathematical expressions (ME). The stroke order dependent system has less time complexity than the stroke order free system, but it must incorporate special grammar rules to cope with stroke order variations. The stroke order normalization technique solves this problem and also the problem of
-
An improved discriminative region selection methodology for online handwriting recognition Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-11-16 Subhasis Mandal; S. R. Mahadeva Prasanna; Suresh Sundaram
The task of online handwriting recognition (HR) often becomes challenging due to the presence of confusing characters that are separable only by a small region. To address this problem, we propose a “discriminative region (DR) selection” technique which highlights the discriminative region that distinguishes one character from another similar character. The existing DR selection approach for online handwriting
-
A comparative study of delayed stroke handling approaches in online handwriting Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-11-13 Esma F. Bilgin Tasdemir; Berrin Yanikoglu
Delayed strokes, such as i-dots and t-crosses, cause a challenge in online handwriting recognition by introducing an extra source of variation in the sequence order of the handwritten input. The problem is especially relevant for languages where delayed strokes are abundant and training data are limited. Studies for handling delayed strokes have mainly focused on Arabic and Farsi scripts where the
-
KERTAS: dataset for automatic dating of ancient Arabic manuscripts Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-09-08 Kalthoum Adam; Asim Baig; Somaya Al-Maadeed; Ahmed Bouridane; Sherine El-Menshawy
The age of a historical manuscript can be an invaluable source of information for paleographers and historians. The process of automatic manuscript age detection has inherent complexities, which are compounded by the lack of suitable datasets for algorithm testing. This paper presents a dataset of historical handwritten Arabic manuscripts designed specifically to test state-of-the-art authorship and
-
Building efficient CNN architecture for offline handwritten Chinese character recognition Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-08-29 Zhiyuan Li; Nanjun Teng; Min Jin; Huaxiang Lu
Deep convolutional neural networks-based methods have brought great breakthrough in image classification, which provides an end-to-end solution for handwritten Chinese character recognition (HCCR) problem through learning discriminative features automatically. Nevertheless, state-of-the-art CNNs appear to incur huge computational cost and require the storage of a large number of parameters especially
-
A combined strategy of analysis for the localization of heterogeneous form fields in ancient pre-printed records Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-07-26 Aurélie Lemaitre; Jean Camillerapp; Cérès Carton; Bertrand Coüasnon
This paper deals with the location of handwritten fields in old pre-printed registers. The images present the difficulties of old and damaged documents, and we also have to face the difficulty of extracting the text due to the great interaction between handwritten and printed writing. In addition, in many collections, the structure of the forms varies according to the origin of the documents. This
-
Integrating scattering feature maps with convolutional neural networks for Malayalam handwritten character recognition Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-07-02 K. Manjusha; M. Anand Kumar; K. P. Soman
Convolutional neural network (CNN)-based deep learning architectures are the state-of-the-art in image-based pattern recognition applications. The receptive filter fields in convolutional layers are learned from training data patterns automatically during classifier learning. There are a number of well-defined, well-studied and proven filters in the literature that can extract informative content from
-
Augmented incremental recognition of online handwritten mathematical expressions Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-06-16 Khanh Minh Phan; Anh Duc Le; Bipin Indurkhya; Masaki Nakagawa
This paper presents an augmented incremental recognition method for online handwritten mathematical expressions (MEs). If an ME is recognized after all strokes are written (batch recognition), the waiting time increases significantly when the ME becomes longer. On the other hand, the pure incremental recognition method recognizes an ME whenever a new single stroke is input. It shortens the waiting
-
A comprehensive study of hybrid neural network hidden Markov model for offline handwritten Chinese text recognition Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-06-15 Zi-Rui Wang; Jun Du; Wen-Chao Wang; Jian-Fang Zhai; Jin-Shui Hu
This paper proposes an effective segmentation-free approach using a hybrid neural network hidden Markov model (NN-HMM) for offline handwritten Chinese text recognition (HCTR). In the general Bayesian framework, the handwritten Chinese text line is sequentially modeled by HMMs with each representing one character class, while the NN-based classifier is adopted to calculate the posterior probability
-
Learning to detect, localize and recognize many text objects in document images from few examples Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-06-09 Bastien Moysset; Christopher Kermorvant; Christian Wolf
The current trend in object detection and localization is to learn predictions with high capacity deep neural networks trained on a very large amount of annotated data and using a high amount of processing power. In this work, we particularly target the detection of text in document images and we propose a new neural model which directly predicts object coordinates. The particularity of our contribution
-
Fully convolutional network with dilated convolutions for handwritten text line segmentation Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-05-30 Guillaume Renton; Yann Soullard; Clément Chatelain; Sébastien Adam; Christopher Kermorvant; Thierry Paquet
We present a learning-based method for handwritten text line segmentation in document images. Our approach relies on a variant of deep fully convolutional networks (FCNs) with dilated convolutions. Dilated convolutions make it possible to avoid reducing the input resolution and to produce a pixel-level labeling. The FCN is trained to identify X-height labeling as text line representation, which has many advantages
-
Fusion of LLE and stochastic LEM for Persian handwritten digits recognition Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-05-10 Rassoul Hajizadeh; A. Aghagolzadeh; M. Ezoji
In this paper, a new local manifold learning (ML) method is proposed. Our proposed method, which is named FSLL, is based on the fusion of locally linear embedding (LLE) and a new Stochastic Laplacian Eigenmaps (SLEM). SLEM is the same as a common LEM technique, but the coefficients between each data point and its neighbors are calculated by a stochastic process. The coefficients of SLEM make a probability
-
Recognition-based character segmentation for multi-level writing style Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-04-30 Papangkorn Inkeaw; Jakramate Bootkrajang; Phasit Charoenkwan; Sanparith Marukatat; Shinn-Ying Ho; Jeerayut Chaijaruwanich
Character segmentation is an important task in optical character recognition (OCR). The quality of any OCR system is highly dependent on the character segmentation algorithm. Despite the availability of various character segmentation methods proposed to date, existing methods cannot satisfactorily segment characters belonging to some complex writing styles such as the Lanna Dhamma characters. In this paper
-
Text box proposals for handwritten word spotting from documents Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-04-27 Suman Ghosh; Ernest Valveny
In this article, we propose a new approach to segmentation-free word spotting that is based on the combination of three different contributions. Firstly, inspired by the success of bounding box proposal algorithms in object recognition, we propose a scheme to generate a set of word-independent text box proposals. For that, we generate a set of atomic bounding boxes based on simple connected component
-
Fixed-sized representation learning from offline handwritten signatures of different sizes Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-04-23 Luiz G. Hafemann; Luiz S. Oliveira; Robert Sabourin
Methods for learning feature representations for offline handwritten signature verification have been successfully proposed in recent literature, using deep convolutional neural networks to learn representations from signature pixels. Such methods reported large performance improvements compared to handcrafted feature extractors. However, they also introduced an important constraint: the inputs to
-
Binarization of degraded document images based on contrast enhancement Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-04-06 Di Lu; Xin Huang; LiXue Sui
Because of the different types of document degradation such as uneven illumination, image contrast variation, blur caused by humidity, and bleed-through, degraded document image binarization is still an enormous challenge. This paper presents a new binarization method for degraded document images. The proposed algorithm focuses on the differences of image grayscale contrast in different areas. Quadtree
-
A novel Arabic OCR post-processing using rule-based and word context techniques Int. J. Doc. Anal. Recognit. (IF 1.486) Pub Date : 2018-04-05 Iyad Abu Doush; Faisal Alkhateeb; Anwaar Hamdi Gharaibeh
Optical character recognition (OCR) is the process of recognizing characters automatically from scanned documents for editing, indexing, searching, and reducing the storage space. The text resulting from OCR usually does not match the text in the original document. In order to minimize the number of incorrect words in the obtained text, OCR post-processing approaches can be used. Correcting OCR